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Preface 



Multibody Dynamics is an exciting area of applied and computational mechanics, 
whose substantial progress during the last fiveS decades has stemmed from the 
rapid and simultaneous development of many technological disciplines like robotics, 
spacecraft and machine design, and was stimulated by the advances in computa- 
tional techniques. In order to deliver methods and tools for the modeling, analysis 
and simulation of complex mechanical systems, various topics were merged in the 
field, including contact and impact, control and mechatronics, real-time simula- 
tion, optimization, flexibility, time integration schemes and software development. 
The current area of interest include robotics and walking machines, road and rail- 
way vehicle dynamics, aerospace, biomechanics, and many other multidisciplinary 
applications. 

The ECCOMAS Thematic Conference on Multibody Dynamics was initiated in 
Lisbon in 2003, and then continued in Madrid (2005) and Milan (2007), aimed at 
providing a venue for exchanging ideas and recent developments related to the the- 
ory and applications of multibody systems. The fourth edition of the Conference was 
held at the Warsaw University of Technology, Warsaw, Poland, from June 29 to July 
2, 2009. At the Conference participated 219 researches from 27 countries, mainly 
from Europe (162), but also from Asia (40), and North (13) and South America 
(4). They presented 167 technical papers, having an excellent forum for discussion 
and technical exchange on the most recent advances in the rapidly growing field of 
Multibody Dynamics. 

The present book is a collection of revised and extended versions of 1 5 papers 
presented at the Conference, recommended by the Session Organizers for pub- 
lication in this post-conference book. The general selection criterion was that 
the papers best reflect the state-of-art of the topics associated to the particular 
sessions, and cover the areas of biomechanics (Raison et al.), contact dynamics 
(Flores et al./Ziegler and Eberhard), control, mechatronics and robotics (Iwamura 
et al./Seifried), efficient methods and real-time applications (Cavagna et al./Pfau 
and Schaden), flexible multibody dynamics (Ambrosio et al./Dibold and Gerst- 
mayr), formulations and numerical methods (Garcia Orden and Aguilera/Schindler 
et al.), miscellaneous multibody applications (Fr^czek and Wojtyra), optimization 



vi Preface 

(Brills et al.), software development, validation, and education (Tasora et al.), and 
vehicle systems (Bottasso et al.)- We hope you will find the reading of this collection 
enjoyable and stimulating. 

March 2010 Wojciech Blajer, Krzysztof Arczewski 

RadomAVarsaw Janusz Fr^czek, and Marek Wojtyra 
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A Flexible Multibody Pantograph Model 
for the Analysis of the Catenary-Pantograph 
Contact 

Jorge Ambrosio, Frederico Rauter, Joao Pombo, and Manuel S. Pereira 



Abstract The pantograph-catenary system is still the most reliable form of 
collecting electric energy for running trains. This system should ideally run with 
relatively low contact forces, in order to minimize wear and damage of the con- 
tacting elements but without contact loss to avoid power supply interruption and 
electric arching. However, the quality of the pantograph-catenary contact may be 
affected by operational conditions, defects on the overhead equipment, environ- 
mental conditions or by the flexibility of the pantograph components. In this work 
a flexible multibody methodology based on the use of the mean-axis conditions, as 
reference conditions, mode component synthesis, as a form of reducing the number 
of generalized coordinates of the system and virtual bodies, as a methodology to 
allow the use of all kinematic joints available for multibody modeling and applica- 
tion of external forces, are used to allow building the flexible multibody pantograph 
models. The catenary model is built in a linear finite element code developed in a 
Matlab environment, which is co-simulated with the multibody code to represent 
the complete system interaction. A thorough description of rigid-flexible multibody 
pantograph models is presented in a way that the proposed methodology can be 
used. Several flexible multibody models of the pantograph are described and pro- 
posed and the quality of the pantograph-catenary contact is analyzed and discussed 
in face of the flexibility of the overhead components. 



1 Introduction 

The interaction between the pantograph and the catenary is one of the factors 
that limits the operating speed of railway vehicles and, consequently, is one 
of the research priorities in the European railway community. These limitations 
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concern not only the ability to collect energy at high operating speeds but also 
the interoperability between the overhead equipment in trains and infrastructure. 
From the mechanical point of view, the most important feature of the pantograph- 
catenary system consists on the contact quality between the registration strips of 
the pantograph and the contact wires of the catenary. The system must ideally run 
with relatively low contact forces, to minimize wear and damage of the contact 
elements, but with high enough contact forces to prevent contact loss, to ensure a 
constant power supply and minimize the occurrence of electric arching. The design 
of pantograph-catenary systems aims at controlling the interaction phenomena 
maintaining the contact forces within an acceptable operational envelope. Among 
the factors that affect the quality of the pantograph-catenary contact are those con- 
cerned with the defects on the catenary or pantograph, environmental conditions, 
such as wind [1,2] and extreme temperatures [3], running dynamics of the railway 
vehicle and the deformability of the pantograph mechanical system. This work 
proposes the use of flexible multibody methodologies to describe the pantograph 
system and the co-simulation of the models obtained with detailed finite element 
models of the catenary to evaluate the quality of the overhead contact and to identify 
the main mechanical issues influencing it. 

Some of the earlier works in flexible multibody systems use fixed reference 
frames to describe the small elastic deformations given by the finite element method 
of planar mechanical systems [4]. This methodology effectively coupled the rigid 
body motion and the small deformations. To be able to analyze complex shaped 
flexible multibody systems, Shabana and Wehage proposed the use of substructur- 
ing and the model component synthesis method to reduce the number of generalized 
coordinates required to represent the flexible components [5, 6]. Reduction of the 
dimension of the flexible multibody problem is achieved by choosing only a small 
number of suitable vibration modes. In most cases only a small number of natural 
modes of vibration are needed, namely those related to the lower natural frequen- 
cies of the structure. The static correction modes represent the typical response of a 
structure subject to given boundary conditions [7]. Criteria to estimate the number 
of modes of vibration of each type have been proposed and proven to be successful 
for low velocity systems [8]. 

The coupling terms is dependent of the type of finite elements used in the model 
and involve the derivation of the element shape functions, which are not available in 
finite element literature [9, 10]. To enable the general use of finite element types in 
the analysis of flexible multibody systems a lumped mass formulation, based on the 
diagonalization of the mass matrix preserving the rotational inertia, is used [11]. 

For problems in which the flexible bodies experience nonlinear deformation the 
approach must be different and based on large displacements and rotations theory 
[12-14]. For nonlinear problems Kane and coworkers propose a nonlinear theory 
that includes the dynamic stiffening [15]. In the line of work developed by the finite 
element community [16-18], Cardona and Geradin propose a formulation for the 
nonlinear flexible bodies using either exact geometrical models or substructuring 
[19,20]. An approach based on the use of the finite rotations nodal coordinates 
enabling the capture of the geometric nonlinear deformations has been proposed 
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and named absolute nodal coordinates [21]. A different approach has been proposed 
to model geometric nonlinear deformations based on relaxing the need to exhibit 
small moderate rotations about the reference frame, by using an incremental finite- 
element approach within the flexible body description [22]. The extent of the use of 
the referred approach to model material nonlinearities has also been proposed [23]. 
In these formulations the rigid body motion and the elastic variables are expressed 
in inertia reference frames, where the deformation state is derived with respect to 
local reference frames and a relation deformation-displacement is obtained but, due 
to the inherent nonlinearity of the deformations, the problem cannot be reduced 
implying the handling of large system matrices during the analysis process. Because 
the deformations observed in the pantograph or on the catenary are small, only linear 
elastodynamics is considered in the flexible multibody models used. In any case, 
structural damping is used in order to improve the time integration [24]. 

The use of finite element method on the framework of flexible multibody dynam- 
ics implies the definition of a set of reference conditions. Straightforward reference 
conditions are the body fixed reference frames, where the frame is attached to one 
or more nodes of the flexible body, constraining at least six degrees of freedom 
[11]. The mean-axis reference conditions correspond to a different approach by 
introducing a floating frame defined to minimize the kinetic energy associated to 
the deformation, measured with respect to an observer stationed on the flexible 
body [25, 26]. Another type of floating reference frame is called the principal axis 
reference conditions where the origin of the reference frame is associated to the in- 
stantaneous center of mass and its directions to the principal inertia directions [27]. 
Augusta and Ambrosio analyze and point out the major advantages and drawbacks 
of the different methodologies and their main applications [28] . The mean-axis con- 
ditions are used here as reference conditions for the flexible multibody formulation. 

To be able to use the extensive library of kinematic constraints developed for 
rigid multibody systems with flexible bodies, the concept of virtual bodies has been 
developed [29, 30]. The numerical efficiency of this methodology applied to com- 
plex structures was shown using a sparse matrix solver after comparison to that of 
multibody models with custom developed flexible kinematic joints [31]. Kinematic 
joints based on the use of the virtual body approach [32] are implemented in this 
work so that the pantograph model can include the type of kinematic joints particu- 
lar to its construction, described in reference [32]. 

Previous studies of the catenary-pantograph interaction emphasized not only the 
mechanical aspects of construction, operation and maintenance but also the chal- 
lenges of its numerical simulation due to the multi-physics characteristic of this 
problem. Ranging from simple linear catenary models using 2D finite elements 
[33], or having the catenary represented by cables and loaded by a lumped mass 
model of a pantograph [34] several approaches proposed kept the problem simple 
enough to tackle it by a single code. In a similar line of work Dahlberg describes 
the contact wire as an axially loaded beam and uses modal analysis to represent 
its deflection when subjected to transversal and axial loads, showing in the pro- 
cess its relation to the critical velocity of the pantograph [35]. Labergri presents 
a very thorough description of the pantograph-catenary system that includes a 2D 
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finite element model for the catenary and a multibody pantograph, being the contact 
treated by unilateral constraints [36]. In all works mentioned it is claimed that the 
catenary structural deformations are basically linear and consequently the catenary 
systems are modeled using linear finite elements. The slacking of droppers is an 
exception being handled as a nonlinear effect. Another approach is proposed by Seo 
and coworkers that states the need to treat the catenaries as being nonlinear due to 
their large deformations [37, 38]. The large deformation of the catenary is modeled 
using the three-dimensional finite element absolute nodal coordinate formulation 
while the pantograph is a full 3D multibody model. The interaction between the 
pantograph and the catenary is modeled by a sliding joint that allows for the motion 
of the pan-head on the catenary cable and no contact loss is represented [38]. Arnold 
and Simeon address the pantograph-catenary interaction as time dependent prob- 
lems coupled by constraints on boundaries [39]. A half-explicit integration method 
reversible in time was also developed in order to preserve as much information as 
possible during time discretization. Due to the multi-physics problem involved in 
the catenary-pantograph system, Veitl and Arnold proposed a co-simulation strat- 
egy between the code PROS A, where a catenary is described by the finite difference 
method and the SIMPACK commercial multibody code, used to simulate the pan- 
tograph [40]. Several strategies tackling the co-simulation problem, such as gluing 
algorithms proposed by Hulbert and coworkers [41] or the co-simulation procedures 
suggested by Kubler and Schiehlen [42]. Recognizing that the finite element method 
is appropriate to model in detail the catenary and that the multibody dynamics ap- 
proach is suited to handle the pantograph dynamics, a co-simulation approach using 
two separate codes is proposed in this work. 



2 Flexible Multibody Systems 

2.1 Flexible Body Equations of Motion 

For the flexible body depicted by Fig. 1 let q, = [qj^ u''^] . be the vector of gener- 
alized coordinates of body i, where q^ = \yJ p^^] represents the translational and 
rotational position of body i local coordinate system (^, r], i,)i and vector u' repre- 
sents body / elastic coordinates. The flexible body equations of motion are obtained 
by Gon9alves and Ambrosio as [11]. 
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where the mass matrix M, contains the mass, inertia tensor and inertia coupling 
terms, vector s, represents the velocity quadratic terms and other acceleration in- 
dependent terms, g,- is the generalized external force vector, and K,- is the finite 
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Fig. 1 Flexible body i 




elements stiffness matrix. The mass matrix in Eq. (1) may be either consistent or 
lumped. In order to maintain the inertia coupling terms independent of the finite 
element shape functions, the lumped mass formulation is used in this work [11]. 

The equations of motion obtained, using consistent or diagonalized mass ma- 
trices, do not have a unique displacement field. It is necessary to impose a set of 
reference conditions to eliminate the rigid body modes and provide the unique dis- 
placement field of the flexible body. In general, reference conditions are written as 
kinematic constraints that relate the independent and the dependent elastic coordi- 
nates. The mean axis conditions constraints are such that enforce the local frame 
(^, r], ^)i of body i to follow the motion of the nodes in such a way that the kinetic 
energy associated with the deformation corresponds to a minimum value for an ob- 
server stationary in the body local frame [25, 26]. The deformation kinetic energy 
of a flexible body can be expressed in terms of the generalized elastic coordinates 
with respect to the local coordinate system (^, rj, ^),- as: 
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where the nodal translation velocities are denoted by S^^. and the nodal angular ve- 
locities by Qj^. The generalized elastic coordinate velocities u^ of a node k of the 
body mesh are written in terms of generalized set of coordinates q^ of the body as: 
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in which matrix A represents the transformation matrix from the body local coordi- 
nate system to the inertial frame. Minimizing the deformation kinetic energy of the 
body with respect to the translational and rotational velocities leads to 
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Substituting Eq. (3) into Eq. (4) results in the velocity constraint equations that 
define the mean axis reference conditions, as 



$ = 



E '^kh 
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(5) 



The velocity constraint equations may be written in more compact form as: 
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(6) 



where $^7" represents the Jacobian matrix of the mean axis reference conditions 
constraint equations, which is explicitly written as 
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The time derivative of Eq. (6) results in the acceleration constraint equations of the 
mean axis reference conditions, written here in a compact form as 



^(ma) ^ ^(^ma)^, ^ ^(ma) 



(8) 



The constraints associated to the mean axis conditions are imposed on the flexible 
body equations of motion, described by Eq. (1), leading to 
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Note that the mean axis conditions are non-holonomic constraint conditions and can 
only be defined at the velocity and acceleration levels. 

The flexible body equations of motion, shown in Eq. (9), include a very large 
number of generalized coordinates, leading to a computational expensive proce- 
dure. For linear elastic small deformations, as those experienced by the pantograph 
components in the applications foreseen in this work, it is possible to represent the 
deformation of the flexible body as a sum of deformation modes that are constant 
in time. Let those deformation modes be the modes of vibration associated to the 
natural frequencies of the flexible body. The generalized elastic coordinates of body 
;' are now described by a weighted sum of these modes as 



Xw 



(10) 
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where w represents the contributions of the modes of vibration towards the nodal 
displacements and X the modal matrix containing a selected number of modes of 
vibration x, that are obtained by solving the eigenproblem: 



KffXi = MiMffXi 



(11) 



The solution of Eq. ( 1 1 ) is independent of the reference conditions used to constrain 
the rigid body movement of the elastic coordinates. Therefore the modes of vibration 
obtained correspond to those of a structure free in space, defined as free-free modes. 
The vibration modes obtained related to the first six lowest frequencies, generally 
null, represent the rigid body motion of the flexible body. These modes of vibration 
are removed from the modal matrix. A simpler system of equations is obtained by 
normalizing x, with respect to the mass matrix M^ 



X^M^X 
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(12) 
(13) 



where A is a diagonal matrix containing the square the natural frequencies associ- 
ated to each mode of vibration. 

By substituting Eq. (10), and its time derivatives, into Eq. (9), pre-multiplying the 
second row by X and using the relations described by Eqs. (12) and (13) leads to: 
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The number of generalized elastic coordinates used in Eq. (14) is equal to the num- 
ber of vibration modes included in the modal matrix, thus being possible to reduce 
considerably the problem dimension considering a general use of flexible multibody 
models. The effects of local deformations induced by high concentrated loads origi- 
nated, for example, by kinematic constraint reaction forces or other force elements, 
can also be included in the modal synthesis using of static correction modes [7, 8]. 



2.2 Kinematic Joints with Virtual Bodies 



The use of flexible bodies requires that the kinematic joints implemented in the 
multibody code are re-written again for the new set of generalized coordinates used. 
A form to circumvent this difficulty is to use the concept of virtual bodies introduced 
by Bae et al. [30] and further developed by Ambrosio and Gonfalves [31,32]. With 
the virtual body approach, a rigid joint between a flexible body and a rigid body 
is derived for a node k of the mesh of the flexible body and the origin of the vir- 
tual rigid body fixed reference frame, as depicted in Figs. 2 and 3. Afterwards, any 
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Fig. 2 Rigid joint between a flexible body i and a massless rigid body j : (a) virtual body; (b) nodal 
and body fixed coordinate systems 
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Fig. 3 External forces applied on a flexible body: (a) Forces applied directly on the nodes; 
(b) FORCES applied on the virtual body attached to the finite element nodes 



kinematic joint between the flexible body and other bodies of the system is estab- 
lished using the virtual body instead, making possible to use any of the kinematic 
joints available on the multibody code library. 

The constraint equations for the rigid joint are defined for the translational and 
rotational parts of the constraint independently. Let a spherical joint be defined be- 
tween node k of flexible body i and a point P of the rigid body j , coincident to the 
origin of the body fixed coordinate system. This is the translation part of the rigid 
kinematic joint written as 



,(0 



4>('> ^rj-ri-Aibk=0 



(15) 



In order to define the rotational part of the rigid joint let a coordinate system 
(?. '?. Ok be attached to node k, as showed in Fig. 3b. The nodal frame is defined 
by unit vectors Cyt = [ eJ e? e? ] =1 initially parallel to the flexible body / local 
reference frame (^, rj. ^), unit vectors e', . Unit vectors defining both nodal and body 
coordinate frames are expressed in the inertial frame (X. Y, Z) as: 



ek = AiAktk 



(16) 
(17) 
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where A^ = I + G^ represents the nodal rotational matrix for small rotations. The 
rotational part of the rigid joint constraint enforces that the relative orientation be- 
tween the node and virtual body reference frames remains invariant, i.e. 

$W = (AiAkC'^)'^ Ajt'j - r' =0; with(m,/) = (2, 1),(3, 1),(2,3) (18) 

in which only three equations are defined, corresponding to the independent rota- 
tional constraints. Constants /J™' are related to the initial angle between the axis of 
the coordinate systems in the undeformed state. 

The external applied forces on the flexible bodies are applied to the nodes of the 
finite element model, as shown in Fig. 3a. Assume that force f,; and a moment n, , 
shown in Fig. 3a, are applied in node k of the flexible body. Then, introducing a 
virtual body rigidly attached to that node allows for the direct applications of these 
forces on the center of mass, as shown in Fig. 3b. Note that the use of the virtual 
body approach allows for setting rigid joints with more than one node at a time. This 
approach can be used to setup complex interaction conditions between the flexible 
body and the external environment. 



3 Co-simulation of Multibody and Finite Element Codes 

The fundamental element of the co-simulation between a finite element code, de- 
nominated by EUROPACAS-FE, and the multibody code, herein denominated by 
EUROPACAS-MB, is the contact module between the two subsystems. The contact 
force due to pantograph-catenary interaction is characterized by a high-frequency 
oscillating force with high relative amplitude. Railway industry measurement data 
shows that reasonable values for the contact force are, for a train running at approx- 
imately 80m/s: a mean value of 200 N oscillating between 400 and 100 N. Loss of 
contact in particular points of the catenary may also occur. Therefore impact effects 
must be included in the model. 

Most continuous force contact force models have similar features, i.e., they eval- 
uate the contact force as a function of a pseudo-penetration between two elements 
and a proportionality factor often designated as stiffness of the contact elements. 
The contact model used here, proposed by Lankarani and Nikravesh [43], is of the 
Hertzian type and includes internal damping and relates the normal contact force /„ 
with the penetration between two rigid bodies S by 



fn = KS" 



3(1 -e2) S 

1 + — -^- 



(19) 



where K is the generalized stiffness, e is the restitution coefficient, S is the relative 
penetration velocity and S^~^ is the relative impact velocity. Factor K is obtained 
from the Hertz contact theory as the external contact between two cylinders with 
perpendicular axis. 
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The issue of the co-simulation is now 'reduced' to be able to find the state 
variables of the finite element catenary model and of the multibody pantograph 
model at the same instants of time, so that the contact force and its application 
points can be evaluated and used in the equations of motion of each subsystem. 



3.1 Integration of the Finite Elements Equations of Motion 

The motion of the catenary is characterized by small rotations and small defor- 
mations, in which the only nonlinear effect is the slacking of the droppers, being 
typically modeled with linear finite elements. All catenary elements, contact and 
messenger wires are modeled by using Euler-Bernoulli beams. Using the finite ele- 
ment method, the equilibrium equations for the structural system are [44] 

Ma-FCv-l-Kx = f (20) 

where M, C and K are the finite element global mass, damping and stiffness ma- 
trices of the finite element model of the catenary, not to be confused with the finite 
element models of the flexible bodies used for the pantograph. The nodal displace- 
ments vector is x while v is the vector of nodal velocities, a is the vector of nodal 
accelerations and f is the vector with the applied forces. Equation (20) is solved for 
X or for a depending on the integration method used. 

In this work the integration of the nodal accelerations uses a Newmark family 
integration algorithm [45]. The contact forces are evaluated for f -I- Af based on the 
position and velocity predictions for the FE mesh and on the pantograph predicted 
position and velocity. The finite element mesh accelerations are calculated by 

(M + yMC + pAt^K) St+A, = (t+At - Cv^+Af - Kdj+A. (21) 

Predictions for new positions and velocities of the nodal coordinates of the linear 
finite element model of the catenary are found as 

Ar2 
dt+A, =dt + Aty, + —(1- IP) at (22a) 

y,+A, = y, + At{\-y)af (22b) 

Then, with the acceleration Zt+At the positions and velocities of the finite elements 
at time t + At Eire corrected by 

6.t+At = dt+At + PAt^at+At (23a) 

Vf+Ar = Vr+A( + yAtat+At- (23b) 

This procedure is repeated until convergence is reached for a given time step. 



A Flexible Multibody Pantograph Model 

3.2 Integration of the Multibody Equations of Motion 
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The forward dynamic analysis of a multibody system requires that the position vec- 
tor q^ and the velocity vector q'' are given. The multibody equations of motion 
assembled and solved for the unknown accelerations, which are in turn integrated 
in time together with the velocities. Due to the long periods of analysis and to the 
structure of the equilibrium equations not only the stabilization of the integration 
must be insured but also the constraint violations must be eliminated. In this work, 
the Baumgarte constraint stabilization method is used to stabilize the multibody sys- 
tem equations of motion and the Coordinate Partition Method is used to correct the 
position and velocity constraint equations when the violations exceed a prescribed 
acceptable tolerance [46], as depicted in Fig. 4. 

The pantograph-catenary system is characterized by an intermittence of the 
contact between the contact wire of the catenary and the registration strip of the pan- 
tograph. The numerical methods used for the dynamic simulation must be able to 
represent the loss and start of contact. This fact puts particular restrictions on the 
numerical integration algorithms for both pantograph and catenary with particular 
emphasis on the time step size selection. The multibody code used for the panto- 
graph dynamics, considered here, uses as a Gear multi-step multi-order integration 
algorithm [47,48]. 



3.3 Co-simulation Using Different Codes 

The analysis of the pantograph-catenary interaction is done by two indepen- 
dent codes, the pantograph code, EUROPACAS-MB, which uses a multibody 
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Fig. 4 Flowchart representing the forward dynamic analysis of a multibody system implemented 
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Fig. 5 Structure of the communication scheme between the MB and the FE codes 



formulation, and the catenary code, EUROPACAS-FE that is a finite element soft- 
ware. Both programs can work as stand-alone codes. The EUROPACAS-MB code 
provides the EUROPACAS-FE code with the positions and velocities of the pan- 
tographs registration strips. EUROPACAS-FE calculates the contact force, using 
the contact model represented by Eq. (19), and the location of the application points 
in the pantographs and catenary, using geometric interference. These forces are 
applied to the catenary, in the finite element code, and to the pantograph model, in 
the MB code, as implied in Fig. 5. Each code handles separately the equations of 
motion of each sub-system based on the shared force information. 

The compatibility between the two integration algorithms imposes that the state 
variables of the two subsystems are readily available during the integration time but 
also that a reliable prediction of the contact forces is also available at any given 
time step. Several strategies can be envisaged to tackle this co-simulation problem 
such as the gluing algorithms proposed by Hulbert et al. [41] or the co-simulation 
procedures suggested by Kubler and Schiehlen [42]. The key of the synchronization 
procedure between the MB and FE codes is the time integration, which must be such 
that it is ensured the correct dynamic analysis of the pantograph-catenary system, 
including the loss and regain of contact. Let it be assumed that the FE integration 
code is of the Newmark family and has a constant time step. Moreover, let it be 
assumed that the time step of the FE is small enough not only to assure the stability 
of the integration of the catenary but also to be able to capture the initiation of 
the contact between the pantograph registration strip and the contact wire of the 
catenary. The only restriction that is imposed in the integration algorithm of the 
multibody code is that its time step cannot exceed the time step of the FE code. 
Finally let it be assumed that both codes can start independently from each other, i.e., 
the catenary FE model and the pantograph MB model include the initial conditions 
for the start of the analysis expressed in terms of the initial positions and velocities 
of all components of the systems. A fully integrated communication interface is 
implemented according to the two stages represented in Fig. 6. 

Initially, when both codes exchange input data information, it is necessary to 
perform initialization procedures, while, after, data is shared during the dynamic 
analysis [49,50]. The EUROPACAS-MB code provides the EUROPACAS-FE code 
with the information about the number of contacting bodies in the model, and their 
initial position and velocity. Subsequently, the EUROPACAS-FE code provides the 
MB code with information about the initial and final analysis time and the time step 
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FE Communication Flowchart 
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Fig. 6 Communication procedure during the dynamic analysis 



to be used in the FE analysis. Note that nowhere in the communication procedure 
outlined it is implied what kind of integration algorithm is used for the FE catenary 
analysis, provided that it is a fixed time step integrator. Even this condition can be 
relaxed, but it would not have any practical implication as it is not usual that FE 
dynamic analysis is performed with variable time step integrators. 
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4 Analysis of the Pantograph-Catenary Contact Problem 
4.1 Pantograph Multibody Models 

The flexibility of a pantograph is described by the experimental modes of vibration 
shown in Table 1 for the CX Pantograph [51]. The modal data acquisition is obtained 
by imposing a cyclical force to the pantograph head of constant value with frequen- 
cies ranging from to 200 Hz. It is observed that important natural frequencies exist 
in the pantograph within the range of the operating frequencies of the overhead elec- 
tric collection system, justifying that a flexible multibody approach is used to model 
the pantograph. 

Several models of the pantograph, shown in Fig. 7, are modeled using a rigid- 
flexible multibody approach to evaluate their influence on the quality of the 
pantograph-catenary contact. The lower and top arms are steel tubular structures 
with varying cross-section, whereas the pantograph head is composed by steel, 
composite materials and carbon registration strips. Although highly detailed FEM 
models may be derived using solid and shell finite elements, simplified models of 
the referred bodies are used as local effect analysis as stress and strain analysis 
are not required. The mechanical data for the pantograph top arm and for its finite 
element model is shown in Table 2. 

The modes of vibration of the top arm FEM model are obtained for free-free con- 
figuration, i.e., the model free in space. The first six structural natural frequencies 
are shown in Table 3. 

The pantograph head, shown in Fig. 8, is another component for which the flex- 
ibility is expected to play a role. Its structure is composed by several elements with 



Table 1 Representative experimental modal ba.sis of the CX Pantograph 



Mode Frequency 

n° (Hz) Description 



Mode Frequency 

n° (Hz) Description 



1 



11.0 



40.1 



49.3 



Rotational 

movement 

of the 

main frame 

around base 

(Z) 
Bending of 

the lower 

link (Y) 



Bending of 
the top 
arm (Z) 



- Ir-T-^" 



X 




19.3 



45.7 



71.1 



^■^mM-— 



Bending of 
the top 
link (Y) 



Bending 
of the 
panto- 
graph head 

(Z) 

Bending of 
the top 
link (Z) 




^---^ 



^"U^- 



^ 
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Fig. 7 CX pantograph used 
for the rigid-flexible 
multibody models 



Top arm 




Lower arm 



Table 2 Material and geometric data for the upper arm 
and finite element information 



Mechanical data 


Unit 


Value 


Mass 


[kg] 


15.6 


Young Modulus 


[GPa] 


206.8 


Density 


[kg/m^] 


7,820 


Average length 


[m] 


1.36 


FEM model data 




Value 


Finite element 


[type] 


3D elastic beam 


Cross-section 


[type] 


Circular hollow 


Cross-section area 


[m^] 


6.9 X lO"* 


Finite element 




42 


Nodes 




43 



different materials, including steel, for the support structures, carbon strips, for the 
registration strips, and composite materials, for the aerodynamic elements. 

As the first mode of vibration of the pantograph head is a flexion mode, the 
structure may be modeled in a simple and straightforward way as a straight beam. 
The FEIVI model used is composed by a collection of beam elements, with rectan- 
gular cross-section and two lumped masses at the end-points of the straight beam, 
with the general characteristics shown in Table 4. The modes of vibration, for the 
free-free configuration, are described in Table 5. 

In order to appraise the influence of the flexibility of the top arm to the global dy- 
namic behavior of the pantograph in the rigid-flexible pantograph multibody model 
only the top arm is described as a flexible body. Figure 9 shows a representation of 
this multibody model, referred to as pantograph model 2. 

By considering the top arm as a flexible body, four virtual bodies are added to 
the multibody model to allow for the definition of kinematic joints. In Table 6, the 
main characteristics of the flexible multibody model are described. 

The characteristics of the rigid bodies used in the multibody model are presented 
in Table 7 for the fully rigid multibody model. Note that the top arm is flexible and, 
consequently, the mass and inertia is not explicitly given as input data. The virtual 
bodies shown in Table 8, and added to the rigid multibody model, are in the points 
of the top arm involving kinematic joints. 
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Fig. 8 Pantograph head 
representation 
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Structure 



Table 4 Pantograph head 
FEM model characteristics 



FEM model data 


Unit 


Value 


Finite element 




3D elastic beam 


Cross-section 




Rectangular 


Cross-section area 


[m^l 


5 X 10"* 


Finite element 




2 


Finite element 




Lumped mass 


Finite element 




2 


Nodes 




3 



Table 5 Natural frequencies and modes of vibration of the pantograph head FEM model 



Mode Frequency 

n° (Hz) Description 



Mode Frequency 

n° (Hz) Description 



1 



49.6 



173.9 



398.7 



First bending 
moment X 



Second bending 
moment X 



Second 

rotation X 




335.0 



701.0 



Rotation X 



First bending 
moment Z 



Translation -|- 
bending X 



The virtual bodies, 8 through 1 1 , are linked to the flexible bodies through the rigid 
kinematic joints and to the rigid bodies through standard kinematic joints. Table 9 
presents the definition of the rigid kinematic joints for the rigid-flexible pantograph 
multibody model 2. Note that the rigid-flexible joints rigidly attach a node of the 
flexible body mesh to a virtual body. Consequently, the mesh of the flexible body 
must be generated in such a way that at the least one node is included at each point 
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RP-1 



SJ-2 




Fig. 9 Pantograph model 2 with a flexible top arm 



Table 6 Characteristics 
of the pantograph multibody 
model 2 



Multibody model data 



Number 



Rigid bodies 
Virtual bodies 


6 
4 


Flexible bodies 


1 


Rigid kinematic joints 
Rigid-flexible kinematic joints 


8 

3 



Table 7 Rigid body data of the pantograph multibody model 2 









Inertia properties 


Initial 


Initial 




Rigid body 


Mass (kg) 


(kg.m^) 


position (m) 

-vo/.vo/zo 


orientation 


ID 


ei/ej/cj 


1 


Pantograph 
base 


32.65 


2.76/4.87/2.31 


0.00/0.00/0.00 


0.00/0.00/0.00 


2 


Lower arm 


32.18 


0.31/10.43/10.65 


-0.57/0.00/0.41 


0.00/0.17/0.00 


3 


Upper arm 


15.6 


0.15/7.76/7.86 


-0.39/0.00/1.06 


0.00/-0. 18/0.00 


4 


Lower link 


3.10 


0.05/0.46/0.46 


-0.89/0.00/0.28 


0.00/0.21/0.00 


5 


Upper link 


1.15 


0.05/0.48/0.48 


-0.36/0.00/1.00 


0.00/-0. 16/0.00 


6 


Stab, arm 


1.51 


0.07/0.05/0.07 


0.55/0.00/1.42 


0.00/0.00/0.00 


7 


Panto, head 


9.50 


1.59/0.21/1.78 


0.55/0.00/1.51 


0.00/0.00/0.00 



to which a kinematic joint is attached. The linear force elements are detailed in 
Table 10. The force exerted by the air pump, located between the lower arm and the 
base of the pantograph, is represented as a constant moment n,, = 440 Nm applied 
to the lower arm. 
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Table 8 Data for the virtual bodies used in the flexible pantograph multibody model 2 



Inertia properties Initial 
(kg.m^) position (m) 



Initial 
orientation 



ID Rigid body Mass (kg) I^g/I„„/lK 



xo/yo/zo 



eiMM 



10 



II 



'irtual 


0.0 


body 




irtual 


0.0 


body 




irtual 


0.0 


body 




irtual 


0.0 


body 





0.0/0.0/0.0 -1.19/0.00/-0.13 0.00/-0. 1 8/0.00 

0.0/0.0/0.0 1.01/-0.31/0.00 0.00/-0. 18/0.00 

0.0/0.0/0.0 -1.01/0.00/0.00 0.00/-0. 18/0.00 

0.0/0.0/0.0 1.01/0.31/0.00 0.00/-0. 18/0.00 



Table 9 Definition of the kinematic joints used in flexible multibody model 2 

Connected bodies Attachment points Local coordinates (m) 



ID 


Kinematic joint 


i 


J 


Body;(^,/r;,/?) 


Body;(^,/;7,/?,) 


I 


Revolute joint 


1 


2 


(0.02/0.00/0.13)p 
(0.02/1.00/0.13)e 


(0.82/0.00/0.00)p 
(0.82/1.00/0.00)2 


2 


Revolute joint 


2 


10 


(-0.82/0.00/0.00)p 
(-0.82/1.00/0.00)(5 


(0.00/0.00/0.00) p 
(0.00/1.00/0.00)2 


3 


Revolute joint 


11 


6 


(0.00/0.00/0.00) p 
(0.00/1.00/0.00)2 


(0.00/0.3 l/0.00)p 
(0.00/1.31/0.00)2 


4 


Spherical joint 


1 


4 


(-0.26/0.00/0.00) p 


(0.69/0.00/0.00)p 


5 


Spherical joint 


8 


4 


(0.00/0.00/0.00) p 


(-0.62/0.00/-0.03)p 


6 


Spherical joint 


2 


5 


(-0.78/0.00/0.00) p 


(-I.OO/0.00/0.00)p 


7 


Spherical joint 


5 


6 


(0.96/0.00/0.00)p 


(0.00/0.00/- 1. 05) p 


8 


Spherical joint 


9 


6 


(0.00/0.00/0.00) p 


(0.00/0.00/- 1.05)p 


9 


Rev. -prism, joint 


6 


7 


(0.00/0.34/0.00)p 
(0.00/0.34/1.00)2 


(0.00/0.34/0.00)p 
(0.00/1.34/0.00)2 



Table 10 Linear force elements data for the flexible multibody model 2 



Spring elements Bodies 

Linear Damping 

force Stiffness Length coefficient 

ID element (N/m) (m) (N.s/m) ; j 



Attach points local coord (m) 



^,h,n. 



^>hilKi 



1 Spring- 2,000.00 9.06 

damper 

2 Spring- 3,600.00 0.12 

damper 

3 Spring- 3,600.00 0.12 

damper 



3,000.00 I 2 

13.00 6 7 
13.00 6 7 



0.28/0.00/0.09 0.82/0.00/-0.05 
0.00/0.34/0.00 0.00/0.34/0.00 
0.00/-0.34/0.00 0.00/-0.34/0.00 



Another rigid-flexible pantograph multibody model, depicted as model 4 in 
Fig. 10 and described in Table 1 1, is used here considering the flexibility of the pan- 
tograph head only. The objective of this model is to understand the influence to the 
pantograph head on the quality of contact. 
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Fig. 10 Pantograph model 4 with flexible head 



Table 11 Characteristics of 
the flexible pantograph head 


Multibody model data 




Number 


Rigid bodies 




6 




Virtual bodies 




4 




Flexible bodies 




1 




Rigid kinematic joints 




8 




Rigid-flexible kinematic 


joints 


4 



Table 12 Data for the virtual bodies in the rigid-flexible pantograph multibody model 4 



Inertia properties 
(kg.m^) 



Initial 
position (m) 



Initial 
orientation 



ID 


Rigid body 


Mass (kg) 


hi;/ki/ki: 


xo/yo/zo 


ei/czM 


8 


Virtual body 


0.0 


0.0/0.0/0.0 


0.55/0.34/1.51 


0.00/0.00/0.00 


9 


Virtual body 


0.0 


0.0/0.0/0.0 


0.55/-0.34/1.51 


0.00/0.00/0.00 


10 


Virtual body 


0.0 


0.0/0.0/0.0 


0.55/0.34/1.51 


0.00/0.00/0.00 


11 


Virtual body 


0.0 


0.0/0.0/0.0 


0.55/-0.34/1.51 


0.00/0.00/0.00 



In order to establish the kinematic constraints between the stabilization arm 
and the flexible pantograph head, four virtual bodies are used to establish a 
revolute-prismatic joint, to apply spring-dampers and to allow the application 
of the contact force. The positions of the virtual bodies, at the attachment locations 
of the kinematic joints, are shown in Table 12. 

The virtual bodies are also used in order to handle all interactions between 
the flexible body and the surrounding environment, including forces generated by 
spring-damper elements or by contact. The rigid multibody model linear force el- 
ement between the stabilization arm and the pantograph head are now set between 
the stabilization arm and virtual bodies. 
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Fig. 11 Representation of a SNCF 25 kV suspended catenary 
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4.2 Catenary Finite Element Model 

Catenaries are complex periodical structures, such as those presented in Fig. 1 1 . 
Examples of typical structural elements involved in the catenary model are the con- 
tact, stitch and messenger wires, droppers and registration arms. Depending on the 
catenary system there are other elements that may have to be considered. In any 
case, the contact wire is the responsible for the contact between catenary and panto- 
graph and therefore the element that provides electrical power. The messenger wire 
prevents excessive sag caused by the contact wire weight. Both of these wires are 
connected by vertical, tensile force droppers. 

Even in a single European country there are different types of catenaries in use 
with different particularities in their construction. The contact wire is typically 
characterized by a small cross-section, compared to its length, being primarily sus- 
pended at the masts. Depending on the topology of the track and on the exposure to 
transversal winds the masts are placed at a distance of 27-63 m from each other. To 
maintain a constant mechanical stiffness of the contact wire a set of elements are 
designed to suspend the contact wire at these locations, specific of each suspended 
catenary type. In the French 25 kV catenary represented in Fig. 1 1 the contact wire 
is suspended by a low inertial elemental called the steady arm which is linked to the 
registration arm. The latter is suspended with respect to the messenger wire by the 
stitch wire and is connected by a hinge to the mast. This solution aims at limiting 
the dynamic coupling between the contact wire and the supporting elements. To 
minimize the spatial curve described by the contact wire and to maximize the wave 
propagation velocity of the contact wire a static load is applied to its extremities. 
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If seen from the top, the contact wire is suspended forming a zigzag around the 
longitudinal direction, designated by stagger. This geometric characteristic of the 
suspended catenary enables a constant wear of the pantograph registration strip. 



4.3 Simulation Scenario and Results 

To be able to understand the influence of the flexibility of the structural elements 
of the pantograph on the contact dynamics of the pantograph-catenary interface a 
single pantograph scenario is analyzed. The scenario corresponds to a single panto- 
graph system attached to a railway vehicle running at approximately 300 km/h on a 
straight track, as depicted in Fig. 12. 

The flexible multibody model 2 allows the analysis of the deformation of the 
upper arm. As expected the deformation is described by the bending modes of vi- 
bration. The results depicted in Fig. 13 show that the dominant mode of vibration on 
the pantograph top arm behavior is the first bending mode. 

The bending of the upper arm results in lowering the position of the contact points 
with the pantograph head, as depicted in Fig. 14. However, the differences observed 
on the contact kinematics are not reflected on the contact forces, which are similar 
for the rigid and flexible models as seen in Fig. 15. 

The analysis of the influence of pantograph head deformation in the contact force 
generated due to the pantograph-catenary interaction is analyzed also. Disregarding 
the deformation of the main frame can be, it is possible to understand the influence 
of the flexibility of the pantograph head by modeling the pantograph using the flex- 
ible multibody model 4. As depicted in Fig. 16, the first and second bending modes 
contribute to the deformation of the pantograph head. The deformation of the pan- 
tograph head has a very small influence on the contact forces, as seen in Fig. 17. 

Although, for operational conditions and considering the present catenary model 
the influence of the deformation of the pantograph head may be disregarded with- 
out loss of accuracy, its influence is important to develop an actively controlled 
pantograph. Another aspect not studied in this work is the effect of the pantograph 
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Fig. 12 Scenario for a high-speed train equipped with a pantograph running on a tangent track 
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Fig. 13 Modal contribution of upper arm modes of vibration to the dynamic response of model 2 
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Fig. 14 Vertical position of the catenaiy contact point on the registration strip - Model 2 
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Fig. 15 Contact force (filtered at 20 Hz) for the rigid and flexible pantograph model 2 
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Fig. 16 Modal contributions of the pantograph head deformation for model 4 
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Fig. 17 Contact force results (filtered at 20 Hz) for the flexible pantograph model 4 versus rigid 
pantograph model, for the scenario of a train equipped with a single pantograph 



flexibility in extreme conditions or wiien excited close to its natural frequencies, due 
to operational or defect conditions. It is expected that, under these conditions, the 
flexibility of the pantograph components cannot be disregarded. 



5 Conclusions 



The development of flexible multibody models of the pantograph is achieved in a 
straightforward way using the rigid multibody model as a base. The implementation 
of the virtual bodies methodology allows the definition of standard rigid kinematic 
joints, force elements or external applied non-linear forces to link flexible bod- 
ies. A pantograph model is devised testing the influence of the deformation of the 
most relevant bodies to the global dynamic behavior of the pantograph due to the 
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pantograph-catenary interaction. The deformation of main frame of the pantograph 
that reacts to low frequency solicitations of the contact force does not influence 
the numerical results. The kinematic relations of the main-frame lead to a cancel- 
ing effect of the deformation of the lower arm and of the top arm. Furthermore, as 
expected, the pantograph head does not show a high level deformation, thus not in- 
fluencing the contact force results, at least when comparing with the effects of other 
perturbations. It can be stated that for non-perturbed scenarios, for a pantograph 
running on a straight track at 300 km/h it is possible to disregard the effects due 
to the flexibility of the bodies. Nevertheless the use of flexible multibody models 
of the pantograph is important to be able to simulate the dynamic response of the 
pantograph-catenary system to defects, for example. 
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Maneuvering Multibody Dynamics: 
New Developments for Models with Fast 
Solution Scales and Pilot-in-the-Loop Effects 

Carlo L. Bottasso, Giorgio Maisano, and Francesco Scorcelletti 



Abstract The present paper focuses on trajectory optimization problems for 
multibody vehicle models, accounting for the presence of pilot-in-the-loop effects 
and fast dynamic components in the solution. The trajectory optimal control prob- 
lem is solved through a direct approach by means of a novel hybrid single-multiple 
shooting method. Specific focus of the present work is the inclusion of pilot mod- 
els in the optimization process, in order to improve the fidelity of the solution by 
considering the entire coupled human-vehicle system. In particular we investigate a 
series of maneuvers flown with helicopters, quantifying the performance loss due to 
human limitations of the pilot-vehicle system with respect to the sole vehicle case. 



1 Introduction 

The ability to simulate maneuvers of rotorcraft vehicles flying at the boundaries of 
their operating envelope is a valuable asset for performance analysis, handling qual- 
ities research, design and certification, pilot training, and support to the flight test 
activity. In general the maneuver of interest can be fully described in terms of quan- 
tities which should be minimized of maximized, subject to a variety of equality and 
inequality constraints [8-13]. Hence, one can usually give a precise mathematical 
definition of a maneuver by formulating an equivalent optimal control problem. The 
formulation of such a problem necessitates of a model of the vehicle system with its 
inputs, states and outputs, of a cost function and of a list of all constraints. 
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Clearly, the fidelity of the predictions made using this approach crucially hinges 
on the fidelity of the vehicle model. On the one hand, fidelity improvements may be 
obtained by considering a more sophisticated description of the vehicle; the current 
state-of-the-art calls for first-principle multibody models of the vehicle, coupling 
structural, fluid and servo fields. On the other hand, one might clearly consider 
the inclusion of a model of the pilot. In fact, in the absence of a pilot model, the 
solution of a trajectory optimization problem amounts to finding the limit perfor- 
mance trajectory flyable by a "perfect" pilot. In reality, the pilot is a complex system 
which can be modeled so as to account for sensory perceptions, learned behavior 
and biomechanical properties. Therefore, it is reasonable to assume that a maneuver 
optimized considering just a flight mechanics model of the vehicle will in general 
tend to overestimate the vehicle performance, as this has been computed without ac- 
counting for the limitations of various nature of a real pilot. To verify whether this 
is indeed the case, the present work tries to quantify this hypothesized performance 
loss due to the inclusion of a pilot model in the trajectory optimization process. 

A human pilot model should account for various effects: 

• Sensorial perception: the sensorial system of the pilot provides for a perception 
of movements, body position, accelerations, vibrations, etc., which enable the 
pilot to build a representation of the current situation. 

• Control behavior: the pilot, based on the input provided by the sensorial infor- 
mation, evaluates the situation and, on the basis of a desired goal, elaborates a 
control law based on experience and training. 

• Command actuation: the neuro-musculoskeletal system of the pilot acts like an 
actuator that takes as input the control law and translates it into movements of 
the vehicle controls (collective, cyclics, pedal). 

In the literature, there is a wide range of pilot models which have been formulated 
for different applications. As suggested in [21], pilot models can be subdivided in 
the following categories: 

• Crossover Model: a basic model for single-axis tracking tasks, which is use- 
ful for tuning more complete models. In the region of the open-loop crossover 
frequency, the product of the pilot transfer function and that of the vehicle is 
approximated as an integrator with time delay [26] . 

• Isomorphic Models: all models which try to explicitly approximate the dy- 
namics of the human sensory and control systems. The Structural Model offers 
a simplified structural representation of the pilot dynamics in compensatory 
systems [20, 29]. Particular emphasis is given to sensorial feedback, which 
typically includes proprioceptive and vestibular feedbacks, while the neuro- 
muscular components of the model is approximated with a second order filter. 
The Biophysical Models give more emphasis to the dynamics of the pilot neu- 
romuscular system [27]. Finally, Biodynamic Models are based on multibody 
dynamics approaches [22], and are used for investigating the effects of an accel- 
erating/vibrating environment on the pilot control capabilities. 

• Algorithmic Models: models whose principal focus is the control behavior of 
the pilot, but which may include some isomorphism achieving a good degree of 
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completeness. A typical example of this category is the Optimal Control Model, 
which considers the human pilot as an optimal controller [17], and where the 
sensorial component is taken into account by using a Kalman filter. 
• Behavioral Models: models which consider the human pilot as a black box 
with nonlinear behavior. There are two principal approaches in this category: 
Fuzzy-Logic Models, which are based on fuzzy-set theory describing cause-and- 
effect relationships [25, 29], and Neural Network Models, which rely upon the 
capabilities of neural networks of accurately describing nonlinear input-output 
relationships, mapping pilot cues into control tasks [23]. 

Clearly, the most appropriate choice of a pilot model is strictly related to the par- 
ticular application considered. In the framework of trajectory optimization, we need 
to account for all three aspects listed above, namely sensorial perception, control be- 
havior, and command actuation. Furthermore, it would be preferable to work with 
a model formulated in state space form, so as to ease its integration in the overall 
maneuver optimal control problem. 

The sensorial perception can be modeled by formulating appropriate observers, 
for example using Kalman filtering [17]. As a first step towards the goals set forth 
in this study, we have neglected this aspect of the problem in the present paper, al- 
though we plan on considering it in the continuation of this activity. In fact, although 
the inclusion of an observer in the maneuver optimal control problem formulation 
does not pose conceptual difficulties, we have postponed the modeling of this com- 
ponent of the pilot system because of the difficulty in finding data for the tuning of 
the filters. 

The second aspect of the modeling, i.e. the control behavior, is in part already in- 
cluded in the formulation of a maneuver optimal control problem. In fact, the pilot 
elaborates a control law based on desired goals and constraints, which are in fact 
the very cost function and constraints which enter into the definition of the optimal 
control problem. However, some aspects of the control behavior are more subtle and 
difficult to model, such as for example the skills and experience of the pilot. Such 
effects are hard to model in precise mathematical terms, but we speculate here that 
they might be rendered through appropriate modifications of the cost function. For 
example, the modeling of piloting skills might account for degraded piloting be- 
havior for maneuvers which require increased coordination and activity among the 
controls (increased workload) [15]. Such effects are easily included in the proposed 
maneuver optimal control approach, since the coding requires trivial modifications 
to the cost function routines. Nonetheless, specific experimental data are lacking, so 
that even in this case we have not considered these aspects in the present work, while 
waiting to perform experiments with pilots in a simulator to gather the observations 
necessary for the tuning of such models. Therefore, in this paper the control behav- 
ior is translated in the choice of a cost function that includes a problem-dependent 
goal quantity (e.g. altitude loss, time, etc.), and a control term which penalizes ex- 
cessive control activity and/or excessive control rates; specific details on the choice 
of the cost functions are given below in the section of the paper devoted to the ap- 
plications. Such modeling, although rather simple, probably captures a significant, 
and possibly the most significant, part of the pilot behavior. 
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The third aspect of the problem, the command actuation, can be modeled in a 
variety of ways. The more sophisticated approach is based on first-principle model- 
ing of the musculoskeletal system using multibody dynamics, and typically includes 
rigid bodies with their inertial parameters, joints, muscles with their mechanical and 
physiological properties, interactional forces with the environment, and other com- 
ponents as required for the accurate representation of the real bio-system. Such level 
of detail is probably not necessary for capturing the effects of the limitations of the 
bio-system on the vehicle flight mechanics performance. Hence, a simpler approach 
is used here, where the effects of the musculoskeletal system are rendered in a global 
equivalent sense through the use of simple delay and filter models, as detailed below. 

There are two principal approaches to the solution of trajectory optimization 
problem: indirect [16,28] and direct methods [5-7, 10, 13]. Following our previ- 
ous work [1 1], we prefer the direct approach even for the applications which are the 
focus of the present paper. In fact, in the case of the indirect methods one has first 
to derive the optimal control governing equations by using the calculus of varia- 
tion, and then numerically solve the arising two-point boundary value problem. The 
manipulation of the vehicle equations of motion for deriving the optimal control 
governing equations makes it very hard or inefficient, if not altogether impossible, 
to use black-box flight simulators, where more often than not one does not have 
access to the source code. In the case of coupled vehicle-pilot models, the equations 
tend to become even more involved, so that here again the use of a direct approach 
allows for a simpler implementation. In fact, the direct approach does not require 
any manipulation of the equations, as one first discretizes the problem by time step- 
ping (using either a transcription or a shooting method [11]) and then solves the 
resulting Non-Linear Programming (NLP) problem by a standard solver, such as 
SQP (Sequential Quadratic Programming). 

Multibody vehicle models of rotorcraft systems include both slow flight 
mechanics scales and faster aero-elastic ones [14]. To treat more effectively this 
class of optimal control problems of multibody systems, we use multiple shoot- 
ing on the flight mechanics scales, and single shooting on the faster aero-elastic 
ones; this avoids the enforcement of the multiple shooting gluing constraints for 
the faster scales, which greatly enhances convergence and in turn reduces the 
computational cost. 

The paper is organized according to the following plan. At first, we describe the 
pilot model considered in this work and we present the equations of the coupled 
pilot-vehicle system. Secondly, we formulate in general terms the trajectory op- 
timization problem. A discussion about the possible numerical solution strategies 
to solve this problem are given next; namely, we first describe the direct tran- 
scription approach and then we present the direct hybrid single-multiple shooting 
method. Finally, we investigate a number of maneuvering rotorcraft problems, and 
we assess the pilot-in-the-loop effects on the computed limit performance of the 
vehicle. 
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2 Coupled Pilot- Vehicle Model 
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As argued in the introduction, enriching a vehicle model by adding a pilot model is 
a way to improve the performance predictions made using trajectory optimization. 
The main task of a pilot is to govern the vehicle by deciding a suitable control law 
in relation with the maneuver goals, based on the current perception of the situation 
as provided by his/her sensory system. The optimal control model proposed in [24] 
and revisited in [17] is a possible way of rendering these effects. In the present work 
we adopt a similar approach, reformulating it in the context of trajectory optimiza- 
tion. This way, the decision level control behavior of the pilot can be considered as 
embedded in the objective function of the maneuver optimal control problem. 

The two remaining aspects of human pilot limitations are due to sensorial per- 
ceptions and command actuation. As a first step towards the more ambitious goal of 
a complete pilot modeling system, we consider here a simple actuator pilot model 
(Fig. 1), in order to assess its impact on the vehicle performance predictions, as well 
as on the computational cost and robustness of the numerical procedures. 

The vehicle equations of motion can be expressed as 



f{x,x,uj) = 0, 



(1) 



where x are the flight mechanics states, and u the vehicle control inputs (collective, 
longitudinal and lateral cyclics, and pedal). More in general, rotorcraft multibody 
models are described in terms of differential-algebraic equations which also include 
Lagrange multipliers and constraint equations; however, in the sole interest of a 
lighter notation and simpler discussion, we consider in the following the ordinary 
differential set of equations expressed by (1). 

The pilot actuator system is modeled using a pure time delay [17], operating 
in series with a second order filter for the neuromuscular element [29] for each 
control input. The pure time delay is approximated by a second-order Pade transfer 
function, which provides excellent accuracy over the frequency range of the pilot 
(0.1-lOrad/s) [17]; for the single channel we have: 



Yci(s) 



1 - i(r.) + i(r.)^ 



(2) 



> Delay 



Filter 



Uf 



Pilot 



Plant 



Fig. 1 Pilot model: pure delay and second-order filter 
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The second-order filter [29] is written as 

The series of delay and filter on each control channel can be written in linear state 
space form as 

X p = A X p + B u, (4a) 

Uf = C Xp + Du, (4b) 

where jc^ is the neuromuscular state of length 4«„, where the number of control 
inputs flu is equal to 4 for a rotorcraft vehicle. The elements of matrices A, B,C and 
D depend on the delay and filter parameters, and in particular on the time constant 
r of the pure delay, and on the damping factor ^nm and undamped natural frequency 
a)NM of the open-loop neuromuscular system. Referring to Fig. 1 , it should be noted 
that the inputs of the pure time delay module are the "desired" command pilot inputs 
u, while the inputs of the neuromuscular module are the delayed command inputs 
M(/. Finally, the delayed and filtered inputs Uf actuate the rotorcraft vehicle model. 
Collecting together (1) and (4), we can write the governing equations of the cou- 
pled pilot- vehicle model as 

Xp = Axp + Bu. (5a) 

Uf = C X p + D u, (5b) 

f(x.x.Uf,t) = 0. (5c) 

Formally, by collecting all states in a unique state vector xl, = {x^ . x^)^ , by 
collecting all dynamic equations (5) into a single function f^^ and eliminating all al- 
gebraic equations, we can write the governing equations of the coupled pilot- vehicle 
system in the following compact form: 

/p^(i:pv,Xpv,M,0 = 0. (6) 

When optimizing a maneuver considering only the stand-alone vehicle model, 
one uses (1); on the other hand, when the pilot is included in the optimization the 
augmented system (6) is used. Formally, the two are identical, so that no changes 
are necessary to the trajectory optimization software for dealing with the coupled 
pilot- vehicle model. 



3 Formulation of Maneuvers as Optimal Control Problems 

A maneuver can be defined as a dynamic transition between two steady state 
(trimmed) configurations [18], although in the present context it is useful to give a 
looser interpretation of the term by considering also the case of terminal conditions 
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which are not trimmed. Clearly, given a starting and arrival configuration, there is 
an infinite number of ways to transition between the two. A possible way to re- 
move this arbitrariness is to formulate a maneuver as a constrained optimal control 
problem [8-10, 13]. 

The maneuver optimal control problem requires the minimization or maxi- 
mization of a cost or merit function (e.g. time, altitude loss, control activity, fuel 
consumption, etc.), which in general can be expressed in terms of the vehicle states 
or outputs and of the control inputs. Furthermore, the optimization problem is con- 
strained by a number of conditions that should be met by the solution: 

• First, the so-called compatibility conditions must be fulfilled at each time instant 
of the maneuver; in other words, it is required that the computed solution satisfies 
the equations of motion of a suitable flight mechanics model of the vehicle. 

• Second, the solution should remain within the flight envelope and operational 
limits of the vehicle. 

• Finally, most maneuvers of practical interest (Category-A, ADS-33, flare at the 
exit of an autorotation, etc.) are typically characterized by other equality and 
inequality constraints which need to be met in order to satisfy given performance 
and procedural requirements and that, collectively, contribute to giving a precise 
definition of the maneuver of interest. 

The maneuver optimal control problem can be formally expressed as: 

min /=0(j,f) -|- / L(y.u,u,t)dt. (7a) 

'',y,u,T Jo 

s.t.: /(i,x,M) = 0, (7b) 

y = h(x,u), (7c) 

g{y.u,t)^[g"^\g^^^l (7d) 

Solving the problem consists in finding the control function u{t), and hence 
through (7b) and (7c) the associated functions x{t) and ^(0^ which minimize 
the cost / given by (7a). In general, the cost includes a boundary quantity which 
accounts for values of the outputs at the initial and/or final instants, as well as an 
integral cost term. The problem is defined on the interval Q = [0,T],t & Q, where 
the final time T is typically unknown and must be determined as part of the solution 
to the problem. 

The model governing equations appear among the problem constraint conditions, 
and are expressed by (7b) and (7c), where x are the states, u the inputs and y the 
outputs. As shown in the previous section, the model governing equations (7b) can 
be represented by (1) when considering the stand-alone vehicle model, or by (6) for 
the coupled pilot-vehicle case. 

All maneuver-defining and/or envelope-protection constraints are expressed as 
generic algebraic non-linear constraints by (7d). These may include as special cases 
boundary (initial {t = 0) and/or terminal (t = 7")) conditions, constraints at 
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unknown internal time events (t = Ti), generic constraints defined over the whole 
maneuver duration (? e [0, 7"]), which clearly may also include, as it is often the 
case in practical applications, simple bounds on the inputs and states/outputs. 



4 Direct Solution of Maneuver Optimal Control Problems 

As discussed in [11], the direct approach is often the preferable way to solve the 
optimal control problem (7), for a series of practical advantages with respect to 
the classical indirect method. According to the direct approach, the optimal control 
problem is first discretized and subsequently optimized. This procedure yields a 
discrete parameter optimization or NLP problem [19], which can be written as 

min K{z), (8a) 

z 

s.t. a(z) = 0, (8b) 

A(z) e [^'"",6™"], (8c) 

where z is a vector of algebraic unknowns, and ^ is a scalar objective function which 
represents an approximation of the cost / of (7a). The equality constraints (8b) 
are generated by the discretization of the equations of motion (7b, 7c), while the 
inequality constraints (8c) by all other maneuver-defining constraints (7d). Notice 
that the problem defined by (8) is characterized by unknown algebraic parameters z, 
while the optimal control problem (7) by functional unknowns. 

The specific form of the vector of algebraic unknowns and of the constraints 
in problem (8) depends on the method used for performing the discretization. Our 
software program TOP (Trajectory Optimization Program) [11] implements both 
the direct transcription and the direct multiple shooting methods, which are briefly 
reviewed next. 



4.1 Direct Transcription 

This method is very effective and robust, but it is typically applicable only to models 
which have low-moderate complexity [13], i.e. which do not have solution time 
scales which are too fast with respect to the overall maneuver duration, and/or do 
not possess too large a number of states. 

The time interval Q is partitioned as, Q = Iq < t\ < ■ ■ ■ < tj^ = T , where the 
generic time element is Q" = [?„, ?n+i], n = (0, N — 1), of time step size h" = 
tn+\ — tn- On each time element Q" , the governing equations (7b) are discretized 
using a suitable numerical method. The resulting discrete equations are expressed as 

f^(,Xn + l,Xn,u\h") = 0, « = (0,^-1), (9) 
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where //, is an algorithmic approximation of function / of (7b), jc„, x„+i are the 
values of the state vector at t„ and tn+i, respectively, while u" represents the value of 
the control vector within the step. In general there might be additional internal stages 
for both the state and the control variables, depending on the numerical method [11]. 
The NLP problem (8) is defined as follows. First, the NLP vector of parameters 
is chosen as: 

z = (x„=^o,N),u"=^°'''-'\Tf, (10) 

i.e. it is defined by the discrete states and control values on the computational grid, 
and the final time. Notice that, if one needs a very large number of time steps to 
accurately resolve the solution, the size of z will be large, up to the point of making 
this approach unsuitable in terms of computational burden. 

Next, the cost / of (7a) is discretized in terms of z as given by (10), obtaining the 
discrete cost K of (8a). Then, the discretized ODEs within each step, (9), become 
the set of NLP equality constraints appearing in (8b). Finally, all other problem 
constraints and bounds, (7d), are expressed in terms of the NLP variables z and 
become the NLP inequality constraints of (8b). 



4.2 Direct Multiple and Hybrid Single-Multiple Shooting 

Multiple shooting is typically used in applications of moderate/high complexity, 
i.e. with solution time scales which are fast with respect to the maneuver duration, 
and/or a moderate/large number of degrees of freedom [13]. 

The time domain f2 is partitioned as = ?o < fi < ■■■ < tM = T with Q'" = 
[tm, tm+i], m = (0. M — 1), where each i2'" is a shooting segment. In each shooting 
segment f?™, the controls are discretized as u'"(t) = 2Zi = i ■*' (0"™' where s, (t) are 
basis functions, in particular cubic splines in the present implementation, and m™ are 
N^ unknown discrete control values. The control approximations are confined on 
each shooting segment; this has the effect of decreasing the computational cost of 
finite differencing by increasing the problem sparsity. Constraints are enforced at the 
shooting segment boundaries to guarantee the continuity of the controls up to C ^ . 

In the case of direct multiple shooting, the NLP problem (8) is defined as follows. 
First, the NLP unknown parameters are chosen as: 

/ m = (0,M-l) rr\^ ,,,. 

Z = |Xm = (o,M):«,; = (l^^m) .''I , (11) 

i.e. they represent the discrete values of the states at the interfaces between shooting 
segments, the discrete values of the controls within each segment, and the final time. 
Next, the governing ODEs (7b) are marched in time within each shooting seg- 
ment Q'", starting from the initial conditions provided by the values of the states Xm 
at the left boundary of the segment. The effect of the forward integration is to gen- 
erate a discrete time history of states within f2'", which we label x^, / = (1, N'"), 
where N"" is the number of steps taken in that segment. The last value of this 
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sequence is named JCm+i = x'^,„, and represents the new estimate of the state 
variables at the right boundary of the shooting segment. Segments are then glued 
together by imposing the following equality constraints 

x^-x^=0. m = (2,M). (12) 

Multiple shooting segments are used for stabilizing the forward integration of the 
vehicle equations of motion [4]. This is particularly important when analyzing un- 
stable systems, which is often the case when considering rotorcraft vehicles. 

Notice that the size of the unknown parameter vector z is unrelated to the time 
step size used for marching the equations of motion within shooting arcs; hence, 
one may use very fine temporal discretizations without impacting the overall prob- 
lem size, which in fact enables the solution of problems with a higher degree of 
complexity than in the direct transcription case [13]. 

In the direct multiple shooting case, the cost / of (7a) is discretized in terms of 
z as given by (11) and evaluated using the segment time histories xj"; this yields 
the discrete cost K of (8a). Next, the gluing conditions (12) are used to express the 
set of NLP equality constraints appearing in (8b). All other problem constraints and 
bounds, (7d), are expressed in terms of the NLP variables z and become the NLP 
inequality constraints of (8b). 

For complex multibody systems denoted both by slow and fast solution com- 
ponents, we have observed that the satisfaction of the multiple shooting gluing 
constraints can be particularly difficult and usually ends up dominating the prob- 
lem. Once again a rotorcraft multibody model provides for an excellent illustration 
of such difficulties. In fact, models have flight mechanics states which describe the 
gross rigid body motion of the vehicle through the position, orientation, linear and 
angular velocities of a body-attached (or floating, in the case of a flexible fuselage) 
frame of reference, as well as fast scales which are typically related to the rotor 
degrees of freedom, and include rigid and flexible blade states and aerodynamic 
states. 

Often, a naive implementation of multiple shooting fails to achieve convergence 
for such complex multi-scale models. This is not surprising, since the rotor generates 
most of the aerodynamic forces acting on the vehicle and even small variations 
in its states may imply large variations in the resulting forces, which hinders the 
satisfaction of the gluing constraints. 

We have found that these problems can be alleviated by using multi-time scale 
arguments [14]. In fact, the rotor states (both structural and aerodynamic) are sig- 
nificantly faster than the flight mechanics ones. Thus, since the multiple shooting 
treatment of these fast states is the main cause of the two aforementioned issues, i.e. 
raise in computational cost and difficulty in satisfying gluing constraints, one can 
think of treating slow and fast scales using different methods. 

More specifically, a multiple shooting approach is used for the slow states. This 
is crucial, since with single shooting small changes early in the trajectory can pro- 
duce dramatic effects at the end of it [4]; clearly, the problem is exacerbated when 
analyzing unstable systems. Hence, the multiple shooting treatment of slow scales 
avoids the blow up of the solution. 
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Fig. 2 Hybrid single-multiple shooting approach 




Tm+l 



Tm = t-N 



On the contrary, fast scales are treated using a single shooting approach, as 
depicted in Fig. 2. This does not compromise the robustness of the procedure, since 
fast scales will not diverge if slow ones do not; hence, the stabilizing effect produced 
by the multiple shooting treatment of slow scales is felt also at the level of the fast 
ones. 

With such a hybrid single-multiple shooting approach, the size of the resulting 
NLP problem is substantially reduced and so is the total computational cost. Fur- 
thermore, there are no gluing constraints to be enforced for the fast rotor states, since 
only the slow states need to be glued together at the shooting interfaces. This has 
the effect of greatly increasing the robustness of the procedure, and the convergence 
speed. 



5 Applications and Results 

In this section we consider the solution of maneuver optimal control problems of 
practical interest. We analyze the ADS-33 Lateral Reposition Mission Task Element 
(MTE) [3] for handling qualities assessment, as well as a Category-A fly-away [2]. 
Goal of these two examples is a first preliminary assessment of the effects of the 
inclusion of the simplified pilot model described earlier on in this work with respect 
to the computed limit performance. 

The helicopter model, implemented using the rotorcraft multibody FLIGHTLAB 
code [1], represents a generic medium-size multi-engine four-bladed utility vehicle 
in the 9 ton class. 



5.1 Lateral Reposition MTE 



The ADS-33E-PRF specification [3] for military rotorcraft defines a series of MTEs 
which provide a basis for an overall assessment of the vehicle ability to perform 
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certain critical tasks, and result in an assigned level of handling qualities according 
to the Cooper-Harper rating scale. Each MTE is related to a maneuver that shall 
be accomplished considering specific constraints, as described in [3]. In fact, it is 
possible to formulate each MTE as a constrained optimal control problem [12]. 
Hence, with a software implementation of the procedures discussed in this work, it 
is possible to readily compile a library of MTEs of interest in order to predict the 
handling qualities characteristics of a specific rotorcraft. 

We analyze here the Lateral Reposition MTE, considering both the stand-alone 
vehicle model and the coupled pilot- vehicle system. 

According to the Lateral Reposition MTE [3], the helicopter, initially in hover, 
is supposed to translate laterally for 400 ft and then recover the initial hover con- 
figuration. The maneuver must be flown in ground effect since the initial and final 
positions are characterized by an altitude of 35 ft (the rotor diameter is 30 ft); al- 
titude variations must be within ±10 ft. Referring to Fig. 3, the maximum allowed 
displacement in the longitudinal direction is ±10 ft, while the maximum heading 
misalignment is ±10deg with respect to the initial direction. The maneuver must 
be completed within 18 s. 

One possible formulation of this MTE is to consider the following minimum time 
cost function (13): 



J = T +- iiWudt. (13) 

T Jo 

The first term enforces the minimum time condition, while W = diag(M'u) is a di- 
agonal matrix of tunable weighting factors which penalize the control rates. 

It is also necessary to constrain the vehicle trajectory so as to express the MTE 
path requirements described above. With this formulation of the problem the time 
constraint is not explicitly enforced, but verified a posteriori. In other words, one 
tunes the weight parameters W in the merit function (13), this way controlling the 
aggressiveness of the maneuver. Then, once a solution has been computed, one ver- 
ifies whether the maneuver was rapid enough and effectively completed within the 
maximum allotted time. Obviously there are limitations in the maneuver aggres- 
siveness related to the vehicle capabilities and its flight envelope constraints. In this 
case the trajectory constraints are imposed directly through bounds on the position 
variables and heading angle: 

\fit)\ < lOdeg, (14a) 

\x{t)\ < 10ft, (14b) 

\Az{t)\ < 10ft, (14c) 

0< y(f) < 400ft. (14d) 



We solved this problem initially without considering a pilot model; once the 
"pilot-off" solution had been evaluated, we used it as the initial guess for the evalu- 
ation of the "pilot-on" case. The following values for the pilot actuator model were 
used [29]: r = 0.2s, ojnm = lOrad/s, ^nm = cos(jr/4). 
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Fig. 3 Lateral Reposition MTE: snapshots 
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right) 



Lateral Reposition MTE: X, Y, Z positions, and lieading angle (from top to bottom, left to 



Figure 3 shows some snapshots of the helicopter during the maneuver. Figure 4 
gives the time histories of the constrained path variables, for both the pilot-off (solid 
lines) and the pilot-on cases (dash-dotted lines). Figure 5 shows the control time 
history; in the pilot-on case, both the computed pilot model inputs (vector u in (5)) 
and the plant inputs (vector Uf in (5)) are shown in the figure. 
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Lateral Reposition MTE: collective, pedal, lateral and longitudinal cyclic (from top to 
left to right) 



The pilot-in-the-loop effects do not appear to generate significant differences 
with respect to the stand-alone vehicle model for both the trajectory and the control 
inputs. The maneuver duration is in both cases less than the 18 s prescribed by the 
normative, with a slightly longer total time for the pilot-on case. Both trajectory and 
controls do not not appear to have been significantly affected by the neuromuscu- 
lar lag. 



5.2 Category-A Fly-Away 



The effect of pilot actuation are investigated also for the case of a fly-away maneuver 
under Category A certification requirements [2]. A meaningful simulation policy for 
such a maneuver consists in the minimization of the altitude loss, according to the 
cost 



1 C^ 

J = H{T) + - iiWiidt, 
T Jo 



(15) 
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The initial condition is a hover. A latency period of 1.2 s after the engine failure 
is taken into account, during which the pilot realizes the situation and there is no 
control activity. The power loss is modeled as 



Pa.(t) = Ph + (Poei - Ph) K+it) + Ph K-{t), 
K+{t) = sca(r-ro)(l-e"'/''^), 
K-{t) = sca(r - to)e-'^'-'°^l''~ , 



(16a) 
(16b) 
(16c) 



where ?o is the instant of engine failure, Ph is the hover power, Pqei = 1750 HP 
is the one engine inoperative maximum take-off power available, while t"*" = 2/9 s 
and r~ = 1/9 s are suitable time constants. An inequality constraint in the maneu- 
ver optimal control problem (7d) is used for ensuring that the power generated by 
the engine is at all times less than the available one, as expressed by (16a). The final 
conditions are 



W{Tf) = Om/s, 
p{Tf) = q{Tf) = r{Tf) = OdQgl^, 
Q{Tf) > 90%. 



(17a) 
(17b) 
(17c) 



All simulations were conducted outside of ground effect. This single-phase formula- 
tion of the problem considers only the first part of the maneuver, i.e. from the engine 
loss to the moment the lowest point in the trajectory is reached. A multi-phase 
formulation of the same problem covering also the climb part of the Category-A 
maneuver was considered in [8-10]. 

The standard procedure is to fly this emergency maneuver in the longitudinal 
plane of the helicopter. In fact, simulations of this maneuver are often conducted 
with a two-dimensional helicopter model. However, using a three-dimensional 
model, one may observe that the solution converges to a three-dimensional maneu- 
ver with significant yaw and roll (see Fig. 6). The three-dimensional (3D) optimal 
maneuver altitude loss '^H^^ = 15.77 m improves on the two-dimensional (2D) 




Fig. 6 Category A, fly-away: optimal two-dimensional (left) and three-dimensional (right) trajec- 
tories, pilot-off case 
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Fig. 7 Category A, fly-away: altitude loss comparison (left) and inertial velocities (right), pilot-off 
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Fig. 8 Category A, fly-away: helicopter attitude, pilot-off case 



optimal altitude loss ^/^^° ~ 16.72m of about one meter (Fig. 7 at left), as con- 
firmed by the opinion of test pilots. This gain can be explained by observing Fig. 8. 
In the 3D maneuver, both the roll and yaw angles increase and reach their respective 
maxima approximatively halfway throughout the maneuver. This attitude allows for 
some reduction in the vertical velocity (Fig. 7 at right), which explains the decreased 
altitude loss. Clearly, the control activity on the pedal and lateral cyclic is higher for 
the 3D maneuver than for the 2D one. 

These two simulations were repeated including the pilot model. To simplify con- 
vergence, we used a bootstrapping procedure. The first guess was initialized to the 
solution computed without pilot model. Next, the control time histories of the guess 
solution were used for evaluating the pilot model dynamic constraints, thus obtain- 
ing initial estimates of the pilot state time histories. The stand-alone vehicle solution 
augmented with the pilot state time histories was then used as initial guess for the 
pilot-in-the-loop optimization. 

For the coupled pilot-vehicle problem, the resulting optimal maneuvers do not 
change significantly in terms of control input profiles with respect to pilot-off sim- 
ulations, but the altitude loss increases for both the 2D and 3D cases (see Fig. 9). 
In the 2D maneuver the altitude loss is ^H^^ = 18.56 m with a difference of 
1.84m (10.38 %) with respect to the pilot-off case, while in the 3D case we obtain 



AH. 



3D 



17.62 m with a difference of 1.85 m (11.73 %). This is not simply due to 
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Fig. 10 Category A maneuver, fly-away: inertial velocities (left: 2D; right: 3D) 



the fact that all time histories are delayed. The principal reason appears to be the 
delay in the pilot first reaction to the engine loss, which gives a higher maximum 
vertical velocity value, as shown in Fig. 10. 

In conclusion, the introduction of a pilot model seems to have a non negligi- 
ble effect on the performance estimation, which would seem to motivate further 
refinements in the simplified pilot model considered in this preliminary study. Fur- 
thermore, it appears that a 3D maneuver gives better performance (less altitude loss), 
than the usual 2D one. However, the 3D maneuver is harder to fly since it requires 
good coordination skills. Moreover, the pronounced sideslipping might make it dif- 
ficult for the pilot to hold the visual references. 



6 Concluding Remarks 



In this work we have formulated a trajectory optimization approach to maneuver 
modeling in rotorcraft flight mechanics, including pilot-in-the-loop effects and fast 
dynamic solution components. 

The formulation can accommodate the pilot control behavior as part of the def- 
inition of the cost function (and in this sense falls within the category of optimal 
control pilot models), as well as the command actuation and sensorial perception 
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aspects. In this work, the command actuation was rendered using global equiva- 
lent models through the use of a simple delay in series with a second order filter. 
Although a more sophisticated model, as for example a biomechanical multibody 
model, could be readily implemented in the formulation without conceptual diffi- 
culties, the present implementation is probably sufficient for capturing the relevant 
command actuation effects on the flight mechanics characteristics of the response. 
The sensorial perception component of the model was not considered here, mainly 
for the lack of sufficiently reliable data for the tuning of the required Kalman-based 
observers; this aspect of the problem is currently under investigation, and will be re- 
ported in a forthcoming publication. It is reasonable to speculate that the inclusion of 
the perception system model will determine further degradation of the performance, 
although the actual quantification of this aspect remains to be seen. 

Based on the current state of this study, the following conclusions may be drawn: 

• The performance degradation due to pilot-in-the-loop effects depends on the 
particular maneuver considered. In particular, it appears that for the Lateral Repo- 
sition MTE the pilot model induces negligible differences, while the Category-A 
rejected take-off shows a more pronounced effect with an increased altitude loss. 
Other maneuvers will be considered in the continuation of the present study. 

• The inclusion of a pilot model in the optimal control formulation does not imply 
substantial difficulties, since the coupled pilot-vehicle system is formally identi- 
cal to a generic vehicle model expressed in non-linear state space form. 

• The current version of the pilot model has only a modest impact on the com- 
putational cost of the optimization, so that the code retains its ability to conduct 
complete maneuver simulations in the order of minutes on standard desktop com- 
puters. 

• As for all optimization problems, better performance and robustness of the pro- 
cedures relies also on good initial guesses of the solution, which in this case also 
requires initial estimates of the internal pilot states. This was achieved here using 
a bootstrapping procedure, based on an initial solution computed without pilot 
model, followed by the initialization of the pilot states obtained with the com- 
puted pilot-off control inputs. This procedure proved to be easy to implement 
and very effective. 
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Optimization of Multibody Systems 
and Their Structural Components 

Olivier Briils, Etlenne Lemaire, Pierre Duysinx, and Peter Eberhard 



Abstract This chapter addresses the optimization of flexible multibody systems 
based on the dynamic response of the full system with large amplitude motions and 
elastic deflections. The simulation model involves a nonlinear finite element formu- 
lation, a time integration scheme and a sensitivity analysis and it can be efficiently 
exploited in an optimization loop. 

In particular, the paper focuses on the topology optimization of structural compo- 
nents embedded in multibody systems. Generally, topology optimization techniques 
consider that the structural component is isolated from the rest of the mechanism and 
use simplified quasi-static load cases to mimic the complex loadings in service. In 
contrast, we show that an optimization directly based on the dynamic response of 
the flexible multibody system leads to a more integrated approach. 

The method is applied to truss structural components. Each truss is represented 
by a separate structural universe of beams with a topology design variable attached 
to each one. A SIMP model (or a variant of the power law) is used to penalize inter- 
mediate densities. The optimization formulation is stated as the minimization of the 
mean compliance over a time period or as the minimization of the mean tip deflec- 
tion during a given trajectory, subject to a volume constraint. In order to illustrate 
the benefits of the integrated design approach, the case of a two degrees-of-freedom 
robot arm is developed. 



1 Introduction 

This chapter addresses the optimization of flexible multibody systems with large 
amplitude motions and elastic deflections. For example, in deployable space struc- 
tures, piston engines, automotive suspensions, robots and high-speed machine-tools. 
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the articulated components undergo large displacements and elastic deformations, 
and are subject to transient loads and nonlinear dynamic effects. The performance 
of such systems often depends on the mechanical design in a non-intuitive way. 

Several researchers have addressed the optimization of the geometric parameters 
of rigid mechanisms, see among others [9, 18, 24], and also of the connectivity of 
mechanisms made of rigid members as in [27, 36]. Synthesis methods based on 
exhaustive search among possible combinations of links and joints were studied 
in [35]. Optimization techniques have also been exploited to solve optimal control 
problems in multibody dynamics, see for instance [11]. 

Initially developed in structural optimization, topology optimization techniques 
have often been used to optimize the layout of isolated linear elastic structural com- 
ponents under fixed loadings. Layout optimization of structures without any prior 
knowledge on the structural topology can be worked out by formulating the prob- 
lem as an optimal material distribution on a given design domain, see [8] for details. 
As the optimal material distribution problem is generally solved numerically using 
a finite element discretization approach, the design domain is divided into finite el- 
ements and an existence variable is attached to each element. The optimal material 
distribution problem could be solved as a discrete valued problem, but this approach 
would require extensive computational resources because of its highly combinato- 
rial nature. The solution of the discrete problem can be avoided by considering an 
alternative formulation in which the discrete existence variables are replaced by 
continuous density parameters running from void to solid via all intermediate den- 
sities. The density field may be interpreted as the spatial distribution of a fictitious 
porous material. This continuous formulation presents the advantage to allow using 
sensitivity analysis and mathematical programming algorithms to solve the problem 
in an efficient way. 

In most cases, the modelling of the intermediate density properties is based on 
the power-law model, also called SIMP model [6]. The effective Young's modulus 
E and the effective density p are given in term of the continuous existence variable 
X by 

E = x'' Eo. p = X po (1) 

where the index denotes the solid material properties. The factor p is introduced 
to penalize the intermediate densities in order to end up with contrasted "black and 
white" designs. A typical topology optimization result is presented in Fig. 1. In the 
final density map, black elements represent solid parts which belong to the optimal 
structure while white elements represent voids without any mechanical resistance. 

Some extensions of topology optimization have been proposed for components 
with nonlinear geometric conditions [15, 34], nonlinear material behaviour [29] or 
fast dynamic effects [33]. In our particular case, interesting extensions of topology 
optimization are also concerned with the design of compliant mechanisms [32, 37]. 
In this case, the mechanism is considered as a whole, and the design results in 
massive beam-like components with compliant hinges. The compliant mechanisms 
treated in these works are usually not subject to high inertia effects coming from fast 
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Fig. 1 Formulation of 
topology optimization as an 
optimal material distribution 
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motions of the motorized joints as in multibody systems dynamics and the compUant 
hinges generally do not undergo large rotations as in kinematic joints (hinges, 
sliders, etc.) under the action of attached motors or actuators. 

This paper addresses the optimization of mechanisms composed of structural 
components and discrete kinematic joints. Due to their large motions, such mech- 
anisms cannot be modelled as compliant mechanisms, but must be treated as 
multibody systems. When applying classical topology optimization techniques, one 
may consider that each structural component is isolated from the rest of the mech- 
anism and use simplified quasi-static load cases to mimic the complex loadings in 
service. However, two main drawbacks are associated with this approach. Firstly, 
defining the equivalent load cases is a rather difficult task, which is often based on 
trials and errors and which requires some expertise. In [26], a systematic definition 
of the quasi-static loads from the transient response of the system is proposed. How- 
ever, this method leads to a large number of load cases (typically, one load case per 
time step) and both the transient analysis and the static optimization procedure have 
to be repeated several times if the loads strongly depend on the mechanical design. 
Secondly, topology optimization is often sensitive to loading conditions, especially 
for multiple load cases and stress constraints (see [8] for illustrative examples), so 
that the optimal character of the resulting design can become questionable if the 
loading is approximative. For these reasons, in order to obtain better optimal lay- 
outs, this paper proposes an optimization procedure directly based on the dynamic 
response of the full flexible multibody system. 

Literature reports some attempts to combine topology optimization with multi- 
body dynamics for the design of structural components. In [2, 3, 31], a design 
procedure is proposed where each iteration of the optimization process involves 
two sequential steps. First, the dynamic response is computed using a model of the 
flexible multibody system, which is based on a floating frame of reference approach 
and on a modal representation of the structural flexibility. Second, the topology op- 
timization is performed and the finite element model of the structural component 
is updated accordingly. At each iteration, the modal representation of the struc- 
tural components should thus be recomputed from the updated finite element model. 
Considering the complexity of the resulting software architecture, it seems that the 
computation of the sensitivities is a challenging problem, so that gradient-based op- 
timization techniques may not be utilized without significant approximations. 
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In order to overcome those limitations, a more integrated topology optimization 
technique is proposed here, based on the nonlinear finite element approach for flexi- 
ble multibody systems described in [23]. The method is similar to the usual approach 
used in topology optimization in which the continuum domain is discretized into fi- 
nite elements, see [8]. The nonlinear finite element formalism accounts for both 
large rigid-body motions and elastic deflections of the structural components. The 
design variables are classically density-like parameters associated to a SIMP law 
interpolation of effective material properties, see (1). 

The nonlinear equations of motion are solved using a generalized-a time in- 
tegration scheme, see for instance [4], and the sensitivity analysis of mechanical 
responses is based on a direct differentiation method as described in [12]. The 
efficient solution of the optimization problem relies on the sequential convex pro- 
gramming concept at the core of the CONLIN software [22]. 

In the present study, the method is applied to truss components, which are mod- 
elled using the flexible beam finite element available in our multibody simulation 
code. Each truss is represented by a structural universe of beams with a topology 
(i.e. existence) design variable attached to each one. The optimization formulation 
can be stated as the minimization of the mean compliance over a time period or 
as the minimization of the mean tip deflection during a given trajectory, subject to 
a volume constraint. In order to illustrate the beneflts of the integrated design ap- 
proach, the case of a two degrees-of-freedom robot arm is developed. 



2 Optimization of Flexible Multibody Systems 
2.1 Equations of Motion 

A flexible multibody system can be modelled using the nonhnear finite element 
method proposed in [23]. After finite element discretization, the motion of each 
flexible body is represented by absolute nodal coordinates, which are gathered in 
the vector q. The kinematic joints which connect the different bodies impose a set 
of nonlinear kinematic constraints between nodal coordinates, which are noted as 
^ (q, t) = 0. The model of a multibody system has the general form 

M(q,x)q = g(q,q,x,f)-«P^A, (2) 

«P(q,x,f) = (3) 

with the initial conditions at time ? = s 

q(0) = qo(x), (4) 

q(0) = qo(x). (5) 
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Equation (2) represents the dynamic equilibrium and (3) the kinematic constraints. 
The mass matrix is denoted by M, which is not constant in case of large rotations, 
g = g"' — g"" — g''""' — g^^*^ gathers the external, internal, damping and complemen- 
tary inertia forces, 0,q is the constraint gradient and A is the vector of Lagrange 
multipliers. 

The equations of motion (2) and (3) depend on a set of « design variables x which 
can be related with the geometry of the system, its topology, its physical data or its 
applied loads. For given values of the design parameters x, the dynamic response 
q(x, f), A(x, f) is defined as the solution to the system of differential-algebraic 
equations (2)-(5). 



2.2 Formulation of the Optimization Problem 

We consider the general form of optimization problems in multibody dynamics with 
inequality constraints and bounds for the design variables 

minx /o (x) 

(6) 



s.t. 



fj(^)Sfj, y = i,...,m 

i = 1, ...,n 



where the objective function /o(x) and the design constraints fj (x) (7 > 1) depend 
on the dynamic response q(x, f ), A (x, ?). Introducing the compact notation 

z^(x,r) = [q^(x,0 q^(x,0 q^(x,0 A^(x,0], (7) 

the objective function and the design constraints take the general form 



f(x) = / G(z(x,0,x,f)Jf -hF°(z(x,0),x)-hF^(z(x,f/),x,f/). 



(8) 



In this expression, the integrand G accounts for the dynamic behaviour during the 
complete time interval [0, fy], while the functions F'' and F-^ specifically account 
for the initial and final states. The following developments could be extended to 
situations where the final time tf also depends on the design variables. 



2.3 Time Integration Method 

Equations (2) and (3) form a set of nonlinear differential and algebraic equations. As 
suggested in [23], it can be solved using the generalized-a method [17]. Despite the 
presence of algebraic constraints and despite the non-constant character of the mass 
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matrix, this scheme leads to accurate and reliable results, provided the introduction 
of a small amount of numerical damping [4]. Let us briefly describe the formulation 
of this algorithm. 

At time step n + I, the numerical variables Qn+i, q«+i, q«+i and A„+i have 
to satisfy the coupled system (2) and (3). According to the generalized-a method, a 
vector a of acceleration-like variables is defined by the recurrence relation 

(1 -Q!m)a„+i +a!ma„ = (1 -Q!/)q„+i +«/(!„, ao = (jo, (9) 

and the integration scheme is obtained using a in the Newmark integration formulae 

q«+i =qn+hq„+h^(--na„+Ppa„+i, (10) 

qn+i = qn + h(l - y)an + hya„+i (11) 

where h denotes the step size. Second-order accuracy and unconditional stability 
is guaranteed if the algorithmic parameters a/, am, P and y are properly selected 
according to [17]. 

For one time step, the numerical solution is computed using Algorithm 1 , which 
actually solves (9)-(ll) together with the dynamic equilibrium at time ?„+i. The 
Newton iterations try to bring the residuals r = Mq — g + ^X and 4> to zero using 
the linearized form of (2) and (3) 

MAq + CtAq + KtAq + ^^^AX = Ar, (12) 

<?»,qZ\q = Z\0 (13) 

where C? = 9r/9q and K; = 3r/3q denote the tangent damping and stiffness 
matrices. It can be demonstrated, see [4], that the iteration matrix of the algorithm 

is given by 

" [ ^'^ « . 

with p' = {l- am)/(h^p{\ - a/)) and y' = y/{hp). 



2.4 Evaluation of the Objective Function and of the Design 
Constraints 

In order to evaluate the objective function and the design contraints f(x), it is 
convenient to introduce an intermediate variable y(x, t) defined by the differential 
equation 

y(x,0 = G(z(x,0,x,0 (14) 
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Algorithm 1 [q„+i,q„+i,q„+i,A„+i,a„+i] = AlphaStep {q„,q„,q„,a„) 



= 

= 1/(1 -a,„)iafq„ - a,„a„) 

= q„ + hq„ + hHQ.5 - ^)a„ + h^Pa„ + i 

= q„ + hil - y)a„ + hya„ + i 

:=0 



3,1 + 1 

q«+i 
qn+i 

for ;■ = 1 to ii„ax do 

Compute the residuals r and 
if VT|rlF+l*F < tol then 

break 
end if 





Aq- 

AX 


■■= -sr' 
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q^+i := qn+1 + ^q 




kn+i := qn+i + y'M 




q,, + i := q„ + i + ^'Aq 




^,, + i '■= -^n + l + ^^ 




end for 
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-I '■ = 


a„ + i + (1 - 
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)/(l- 


-"m)q«+i 



with the initial condition 

y(x,0) = F°(z(x,0),x). 

As a consequence, f(x) is computed from y according to 

f(x) = y(x,f/) + F^(z(x,f/),x,f/). 



(15) 



(16) 



Equation (14) is easily solved by time integration, for instance, using an adap- 
tation of Algorithm 1 for first-order differential equations [13]. Accordingly, an 
intermediate variable w is introduced such that 

(1 -Q'm)w„+i -FamW„ = (1 -a/)y„+i -Fa/yn, wo = yo, (17) 

and the integration scheme is obtained using w in the integration formulae 

Yn+i =yn+h{l-y)wn +hywn+i. (18) 



2.5 Sensitivity Analysis 

Gradient-based optimization codes usually require the sensitivities of the simula- 
tion results with respect to the design parameters. For problems involving a rather 
large number of design parameters, e.g. in topology optimization, the efficient and 
reliable computation of those sensitivities is an important issue. Indeed, the finite 
difference technique, which is based on repeated simulations with perturbed values 
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of each design parameter, quickly becomes inefficient in this case. Therefore, more 
efficient algorithms should be implemented for the sensitivities, such as automatic 
differentiation [20] or semi-analytical approaches [10, 12]. 

In this paper, the semi-analytical direct differentiation method is presented for 
the computation of the sensitivities, i.e. the sensitivities are computed by differenti- 
ation of the time integration algorithm. For a given design parameter x, we use the 
notation (•)' = d{»)/dx. The sensitivity of the objective function and of the design 
constraints with respect to this design parameter is computed using the chain rule of 
differentiation 

f = y'(x, tf) + F;{(Z(X, tf), X, tf) Z'(X, tf) + F;^(z(x, r/),x, tf) (19) 

where F;^ (resp. F;^) represents the partial derivative of F-^ with respect to the 
parameter z (resp. x). In this expression, the sensitivities z' and y' can be computed 
as described below. 

According to the direct differentiation method, the sensitivities z'=(q', q', q'. A') 
are obtained by solving the differentiated form of (2) and (3) 

Mq' + Qq' + K,q' + «P^A' + r;, =0, (20) 

<?^,qq' + <?^,x = (21) 

with the initial conditions 

q'(0) = q^, (22) 

q'(0) = q'o- (23) 

The partial derivatives r^x and ^x which appear in those equations are sometimes 
referred to as pseudo-loads. Even though the dynamic equilibrium is nonlinear with 
respect to q, q, q and A, one observes that the sensitivity equations (20) and (21) are 
linear with respect to q', q', q' and X' . 

At time step n -\- \, the sensitivities can be computed using the same integration 
algorithm as for the dynamic response, i.e. 

W„+^,i^„+^:q[„+^.X'„+^,i^„+^= AlphaStep' {((„,i(„,t{„,^„) . (24) 

More precisely, AlphaStep' is the same algorithm as AlphaStep, excepted that the 
residuals r and ^ are replaced by the residuals of (20) and (21). Since (20) and (21) 
are linear, a single Newton iteration is sufficient to get the exact values of the sen- 
sitivities at the current time step. In particular, the iteration matrix S? is the same as 
for the original problem. Hence, even for a large number of design parameters, this 
matrix should be computed and factorized only once for the sensitivity analysis at 
time step n + \. However, while an approximate iteration matrix is often sufficient 
to achieve convergence for the original problem, an exact expression is necessary to 
solve the sensitivity problem in a single iteration. 

In a similar way, the sensitivity y' satisfies the differentiated form of (14) 

y' = G,.z' + G,;, (25) 
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Algorithm 2 [f, f] = ObjectiveFunctionAndSensitivities{x) 
initialize qo, qo, qo, ao, yo, yo, wo 
initialize qo, qo. qo. Sq, yo, yo. Wq 
for n = to «/ — 1 do 

[qn+i,qn+i.qn+i--^n + i-an + i] = AlphaStep (<\„ . q„ .({„ .A,,) 

Yn + I = G(z„+,,x, ?„ + ,) 

w„ + i = ((1 - a/)y„+i + a/y,, - ff„,w„)/(l - a,,,) 

y„+i = y„ + H^ - y)w„ + hyw,,^, 

K + 1 ■ q', + r ii'.+i ■ -^'. + 1 • < + i] = AlphaStep' (q,', , q,',, ij;, , a,',) 

y:,+, = G,,z' + G., 

w,' + i = ((1 - «/)y,' +1 + "/J'l ~ «mW,',)/(l - a„,) 

y;,+i = y,', + h{\ - k)w;, + /!Kw;,_^, 

end for 

f=y„, +F''(z„,,X,fyO 

r = y,', , + F:{ (z„ , , X, tfK^ + F:{- (z„, . x, //) 



with the initial condition 

y'(0) = F°,q'o+F°.q'o + F«,. (26) 

In summary, the dynamic response, the objective function, the design constraints 
and their sensitivities can be computed efficiently in a single but extended simula- 
tion. In the case of a single design parameter, the complete procedure is described 
in Algorithm 2. The implementation effort to develop this algorithm in an existing 
simulation software is limited since the core routine AlphaStep does not need to be 
modified and since the sensitivity routine AlphaStep closely resembles it. 



2.6 Optimization Algorithms 

Several optimization methods have been applied to solve problems in structural 
and applied mechanics. Depending on the problem characteristics and the avail- 
able information, one or several of these methods can be selected. On the one hand, 
heuristic methods such as Genetic Algorithms [5] or Particle Swarm methods [36] 
are algorithms inspired by natural phenomena. These algorithms only require the 
computation of the design function values and their global convergence can be 
guaranteed even for non-convex problems with local minima. They can tackle prob- 
lems with discrete valued variables and non-smooth functions. However, the main 
drawback of these methods is their slow convergence rate and the large number 
of function evaluations usually needed to reach the optimum, which results in a 
high computational load for large scale systems. Thus, they are generally restricted 
to problems with a small number of design variables (about 10). On the other hand, 
one can take advantage of mathematical programming (MP) methods, which usually 
require the design function derivatives. Their convergence speed is generally higher 
than for heuristic methods. MP methods have been successfully applied to solve 
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large scale structural and multidisciplinary optimization problems. Furthermore, 
due to their high convergence speed, the optimal solution can be obtained within 
a limited number of iterations and function evaluations. Amongst the well-known 
applications of mathematical programming, we can mention CONLIN [22], MMA 
and its extensions [14, 38, 39], FAIPA [25] or IPOPT [40]. 

CONLIN, MMA and its extensions are based on the so-called sequential convex 
programming approach, which relies on two concepts sketched in Fig. 2. 

1 . The original optimization problem in (6) that is highly non-linear and implicit 
in the design variables is replaced by a sequence of explicit and convex sub- 
problems, that are built based on a local approximation of design functions 



minx /o (x) 

fji^)slj, j = l,...,m. 



s.t. 



(27) 



Xi < Xi < Xi 



i = 1, 



,«. 



The local approximations are established using the sensitivities as well as a vari- 
ant of the Taylor series expansion of the design functions. 
2. Each local convex subproblem is solved efficiently using fast and effective 
mathematical programming algorithms such as Lagrangian maximization (dual 
method) or interior point methods. 

Dual methods allow to reach the optimum of the local convex subproblem within 
a limited number of iterations independent of the number of design variables. The 
concept has proved to be very general and efficient in topology optimization prob- 
lems, see e.g. [15, 33, 37]. 

In the present work, CONLIN [22] has been selected for its fast convergence 
properties for large scale topology optimization problems. The sensitivities shall 
be efficiently evaluated as shown in Sect. 2. CONLIN, which is an acronym 
for CONvex LINearization, relies on a particular first order Taylor expansion 



Fig. 2 Iterative solution 
of structural optimization 
problems using a sequential 
convex programming 
approach 




Original Constraint 
f(x) = 

Convex 
approximation 

f(x) = 
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Fig. 3 CONLIN 
approximation [14] 
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based on a combination of both direct variables and reciprocal variables 1/x,. The 
direct variable expansion is used when the first order derivative is positive while the 
reciprocal expansion is exploited when their first order derivative is negative 



A(x)=/j(x»)+i:5i(. 



>fl-i:<'")^t(^ 



1 



XT 



(28) 



where x° is the expansion point in state space, ^_|_ is the sum over all the terms 
for which the derivative is positive and ^_ is the sum over all the terms for which 
the derivative is negative. As illustrated in Fig. 3, f{x) is approximated by a linear 
function fi(x) or by a reciprocal function fr{x) depending on the sign of its first 
derivative at the point of approximation. It can be demonstrated that the CONLIN 
scheme (28) is unconditionally convex and that it is the most conservative approx- 
imation that can be generated with linear and reciprocal variables. This means that 
the approximation (28) tends to lie in the feasible domain of the constraint. It fol- 
lows that the CONLIN method mostly tends to generate feasible new solutions. The 
convex and separable character of the approximated functions allows the use of dual 
optimizers and second order maximization algorithms for the sub-problems. 

For the applications treated in this paper, a coupled software interface is used 
where the dynamic response and the sensitivity analysis are computed using the 
OOFELIE finite element software [16] and the optimization of the design variables 
is achieved by the CONLIN software [22]. 



3 Topology Optimization Techniques 



Topology optimization techniques rely on a finite element discretization of the con- 
tinuous elastic domain. The design variables are the pseudo-densities (existence 
variables) of the elements of the mesh. The modelling of material properties with 
intermediate densities is based on the SIMP model in (1). This power law decreases 
the stiffness (i.e. the efficiency) of intermediate densities while the cost in terms of 
material volume stays linear. The available amount of material being limited, the use 
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of intermediate densities is penalized and the optimal design usually ends up with 
mostly void and solid regions. The exponent /» in (1) is classically chosen equal to 3. 
In the equations of motion, the density variables only appear in the expression of 
the inertia and internal elastic forces. Those forces are computed by a finite element 
assembly procedure 

Tie 

g'-'^'^Cq, q, q, X) = ^ Ljgf'(q,, q„ q„ x,), (29) 

e=\ 
He 

g'"'(q,x) = ^Lfgf(q,,x,) (30) 



where n^ is the number of elements of the mesh, g'""''' is the vector of inertia forces 
g'^'-^'Cq, q, q. x) = M(q. x)q + g^-(q, q, x) (31) 

and Lg is the Boolean localization matrix of the element dofs 

qe = L^q. (32) 

Since the elementary inertia forces depend linearly on p and the elementary elastic 
forces depend linearly on E, the SIMP law yields 

gf"(q., q., qe,xe) = x, gf''(q„ q„ q„ 1), (33) 

gf(q„x,)=xi'gf(q„l). (34) 

In those equations, the right-hand-side depends on the inertia force and the internal 
force of the full-density element, which is readily available in a standard finite el- 
ement simulation software. As a consequence, the relation between the residual r 
and the set of design parameters x is known analytically and the pseudo-load can be 
computed on the element level 

(r.). = gf '(q.,q.,q., 1) + /^xf"' gf (q., 1). (35) 

The global pseudo-load is then obtained by numerical assembly according to 

rie 

r, = ^Lj(r,),. (36) 

e=l 



3.1 Application to the Design of Static Trusses 

In order to illustrate the principle of topology optimization techniques, the design 
of static trusses is first considered. Truss topology optimization has been initially 
investigated by Michell [30] at the beginning of the 20th century. The application of 
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numerical methods to the truss optimization problem is more recent and has been 
proposed in [19] and [21] leading to the so-called ground structure approach, see 
[28] for a review. While the problem was classically formulated in terms of stresses, 
several developments have been performed to establish a displacement formulation 
of the problem, see e.g. [1,7]. 

Starting from an initial truss, i.e. the ground structure, the objective of the method 
is to remove the less efficient structural members. In practice, the ground structure is 
often created by connecting a set of chosen nodes with rods or beams in all possible 
ways. Then, as for topology optimization of continuum structures, a density variable 
X is attached to each structural member. Inspired by topology optimization methods, 
the modelling of material properties with intermediate densities can be based on the 
SIMP model in (1). 

For example, the design problem can often be stated as the minimization of the 
compliance 



if.-] 

2]v 



, . He dV (37) 

2 , 

where t denotes the strain vector, H is the Hooke tensor, and V is the volume of the 
structure. A design constraint is imposed on the maximal volume of material 

ftp 

V = J2^eVf„„,e<r]Vf,„ (38) 

e = l 

where He is the number of bars, V/«//,e is the volume of the full bar e, V/uii is the vol- 
ume of the full truss and ?7 is a coefficient between and 1 . Moreover, each density 
is bounded by x„,„ < x < 1 . In the numerical applications presented hereafter, we 
have selected the values rj = 0.4 and x„„„ = 0.01. 

In order to illustrate the topology optimization of static trusses, let us consider a 
linkage composed of cylindrical beams and subject to a static tip load, as shown in 
Fig. 4. The beams are made of aluminum (density = 2700 kg/m-'. Young's modulus 
= 70E9 Pa, Poisson ratio = 0.32), and their diameter is 0.02 m. The minimiza- 
tion of the compliance (37) under the volume constraint (38) leads to the optimal 
truss represented in Fig. 4. After a few iterations, the density variables are either 
close to or significantly larger than zero. It is noticeable that the optimal design, 
which results from a fully automatic optimization procedure, is acceptable for the 
engineering common sense. 



3.2 Topology Optimization of Multibody Systems 

A more challenging problem is the optimization of truss linkages included in multi- 
body systems. In a classical approach, one would reformulate the dynamic problem 
as a set of static problems. In a first step, a rigid multibody software is used in 
order to precompute the loads applied to each component, and in a second step, 
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Fig. 4 Initial and optimal truss for a static tip load 



the topology of each Hnkage is optimized independently using the static approach 
described in the previous section. For this purpose, a set of static load cases should 
be defined in order to mimic the precomputed dynamic loads. 

Alternatively, an integrated optimization approach is proposed here, which is 
based on a dynamic multibody analysis. The advantages are that: (i) the dynamic 
coupling between large overall rigid-body motions and deformations is properly 
taken into account, (ii) a single dynamic analysis is required by the optimizer instead 
of a patchwork of static analyses, (iii) topology-dependent loads can be considered, 
and (iv) the objective function and the design constraints may be defined with re- 
spect to the actual dynamic problem. 

As seen in the previous section, static topology optimization problems are often 
formulated as the minimization of the compliance of the deformed structure. In 
order to extend this approach to dynamic problems, the objective function may be 
defined as the mean compliance over the trajectory 



c = 



f Jo ^ 



tf 



(39) 



where ric represents the number of structural components and C(,) is the instanta- 
neous compliance of component / defined by (37). 

For a robot arm, a task-oriented objective function could be preferred. For exam- 
ple, the mean squared tip-deflection over the trajectory is defined by 



1 r'.f 



1 1 r ^ rigid II ^^ 



(40) 
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where the vector r represents the actual tip position and r„g,v/ is the tip position of the 
undeformed mechanism. Clearly, a reduction of d corresponds to an improvement 
in the trajectory tracking performances. 



4 Example 

4.1 Problem Description 

Figures 5 and 6 represent a robot arm composed of two truss linkages interconnected 
by revolute joints and moving in a horizontal plane, so that gravity can be ignored. 
The parameters of each truss structure were given in Sect. 3.1. Moreover, a non- 
design mass of 5 kg is fixed at the tip of the manipulator. 




0.5 1 

time (s) 



Fig. 5 Kinematic model. Imposed joint trajectory for 6i and ( 



Fig. 6 Initial and final 
configurations 
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Each revolute joint is driven by an ideal motor which imposes a smooth joint 
trajectory 6i (f ) and 02(0- As illustrated in Figs. 5 and 6, a point-to-point trajectory 
has been selected, which is composed of an acceleration phase, a constant speed 
phase, and a deceleration phase. The numerical data are On = Oji = jt/3 rad, 
^1/ = 6*2/ = ^/2 rad, acceleration (deceleration) = 2 rad/s^, velocity = 0.5 rad/s. 

For each linkage separately, an upper bound is imposed to the volume of material 

Ko)<0.4F/„„,(,)' ' = 1.2 (41) 

where Vq) is the volume of linkage i, see (38), and Vjuuxi) is the volume of the 
linkage with full densities. 

The simulation and the sensitivity analysis are realized using the numerical algo- 
rithm described in Sect. 2 with a time step h = 0.02 s. The algorithmic coefficients 
P, y, a/ and am are defined according to the method of Chung and Hulbert [17] 
and the value 0.8 is selected for the spectral radius at infinite frequencies. 



4.2 Optimization 

The minimization of the mean compliance defined by (39) leads to the design re- 
presented in Fig. 7. We observe that the mass densities are reduced as one progresses 
from the ground to the tip, i.e. that densities are lower in areas with large amplitude 
motions. A possible explanation is that the addition of materials in those areas in- 
creases the total amount of kinetic energy in the system, which leads to higher inertia 
forces and vibration excitations. We intend to continue our investigations in order to 
verify this interpretation and to check if this phenomenon is caused by the existence 
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Fig. 7 Minimization of the mean compliance 
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of a local minimum. After seven iterations, the dynamic problem becomes badly 
conditioned and the time integrator is no more able to compute the response of the 
system. 

Since the minimization of the mean compliance does not lead to an acceptable 
design, we have considered the minimization of the mean squared tip-deflection 
defined by (40). Actually, this formulation in terms of trajectory tracking perfor- 
mances is closer to the user expectations, which represents a clear advantage. The 
rigid trajectory r„g,y is computed from the joint trajectory using the rigid kinematic 
relations of the manipulator The resulting optimal design, which is represented in 
Fig. 8, is very attractive from an engineering point of view. Its superiority is further 
demonstrated in Fig. 9, which compares the tip deflections for the initial and opti- 
mized designs. One also observes that the final design is not the same for the two 
structural components, which makes sense since they are subject to different load 
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conditions. This again motivates the interest in the integrated optimization approach 
proposed in this paper, which properly accounts for the dynamic loads during the 
motion of the multibody system. 



5 Conclusions 

This chapter is about the optimization of flexible multibody systems based on their 
dynamic response. The proposed approach involves a nonlinear finite element for- 
mulation, a generalized-a time integrator, a direct differentiation sensitivity analysis 
and an optimization algorithm. 

In particular, the topology optimization of linkages included in multibody sys- 
tems is addressed. The material properties are described by a SIMP model and a 
sequential convex programming optimizer is used. The developments have been im- 
plemented in the OOFELIE simulation package, which is coupled with the CONLIN 
optimizer. 

The methodology has been validated for the design of a two degrees-of-freedom 
robot arm with truss linkages. Due to important inertia effects, the classical com- 
pliance objective function does not yield a satisfactory design. In contrast, a task- 
oriented criterion, such as the mean tip deflection, appears to be a sound basis for 
the optimal design of the two links of the mechanism. 

We conclude that optimization algorithms can rely efficiently on simulation 
tools for flexible multibody systems provided that algorithms for sensitivity anal- 
ysis are implemented in the numerical time integration procedure. In particular, 
the integrated approach proposed in this work opens some perspectives for the 
development of topology optimization techniques for elastic components of mecha- 
nisms. Promising 3D applications can be found in various fields, such as deployable 
space structures, piston engines, automotive suspensions or high-speed robots and 
machine-tools. 
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Real-Time Aeroservoelastic Analysis 

of Wind-Turbines by Free Multibody Software 



Luca Cavagna, Alessandro Fumagalli, Pierangelo Masarati, 
Marco Morandini, and Paolo Mantegazza 



Abstract This work illustrates the feasibility of the implementation of innovative, 
efficient and low-cost solutions for fast-prototyping and customization of controlled 
mechanical and aeroservoelastic systems. A controlled constant-speed wind-turbine 
is considered as an example of the proposed methodologies, where the physical 
aeromechanics problem is controlled by a controller process scheduled for execu- 
tion in real-time on a PC-class computer. The physical problem is simulated by a 
general-purpose multibody process that is scheduled in real-time as well. The pro- 
cesses communicate using real-time inter-process communication primitives. All 
the involved tools are based on free software. 



1 Introduction 

Wind-turbines represent an important means to extract energy from the environment 
in a 'green' manner. The concept of extracting energy from the wind dates back 
thousands of years, including not only power generation (e.g. mills, water pumps) 
but also direct locomotion (e.g. sailing). Modern wind-energy technology relies on 
efficient aerodynamic design and durable mechanical systems. 

Nonetheless, efficient and reliable energy harvesting from winds poses several 
significant challenges, including: 

• optimal harvesting with rather irregular and erratic wind conditions 

• tolerance to wear and fatigue with minimal maintenance 
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• controllability and survivability during exceptional weather and operating 
conditions 

• efficient and fault-tolerant integration with power grids 

It is anticipated that key to many of the issues mentioned above is control. A recent, 
extensive review of the state of the art in wind-turbines aerodynamics and aeroelas- 
ticity is presented in [1]. Smart rotor control research for wind-turbines is presented 
in [2]. 

The goal of this work is to illustrate the rapid feasibility of the implementation of 
innovative, efficient and low-cost solutions for fast-prototyping and customization 
of controlled mechanical and aeroservoelastic systems. The specific problem of de- 
signing control systems for wind turbines is not addressed. The work rather focuses 
on providing analysis tools that can be used for this purpose. 



2 Approach 

The free Real-Time Operating System (RTOS) Real-Time Application Interface 
(RTAI [3]) and the free general-purpose multibody software MBDyn [4], both orig- 
inating from research at the Dipartimento di Ingegneria Aerospaziale of Politecnico 
di Milano, Italy, are at the core of the present work. The use of free software that 
runs on low-cost hardware gives any organization, significantly the academia and 
Small-Medium Enterprises (SMEs), access to powerful and versatile analysis and 
simulation capabilities. 

A controlled constant-speed wind-turbine [5] is considered as one of the pos- 
sible applications of the developed methodologies. Figure 1 shows a sketch of 
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Fig. 1 Sketch of the controlled model 
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the controlled model under analysis [6]. It consists in a multibody model of the 
Controls Advanced Research Turbine (CART), a research wind-turbine in use at the 
National Renewable Energy Laboratory (NREL [7]) for experimental purposes. 

The box containing a picture of the aerogenerator on the left represents the 
multibody model of the wind-turbine, which accounts for its kinematics, structural 
dynamics and aerodynamics. The multibody model outputs the angular speed of the 
shaft, which is fed to the control system. The control modeling environment takes 
care of: 

• generating the gusty airstream input for the multibody analysis 

• determining the generator's torque to be applied to the shaft of the wind-turbine 

• computing the blade pitch input to be used by the multibody model in order to 
control the behavior of the wind-turbine 

The latter item represents the control task. Given the relative simplicity of the 
controller considered in this work, the whole problem could have been mod- 
eled monolithically within the general-purpose multibody simulation environment. 
Nonetheless, the control-related part has been intentionally modeled in a separate 
general-purpose, graphically driven mathematical modeling environment, for the 
sake of generality. This kind of graphical environment represents the natural mod- 
eling environment for control systems, and is offered by many popular software 
packages, e.g. The Mathworks' Matlab/Simulink [8], INRlA's Scilab/Scicos [9, 10], 
Labview [11], and National Instrument's MATRIXx [12]. 

The multibody analysis, instead, represents an effective means to provide the 
virtual simulation of the real process that needs to be controlled. For this reason, 
it only contains the bare physical process, in order to allow to test the real con- 
troller that would be used in a real-world application. The two processes typically 
communicate by means of real-time capable network primitives, or by inter-process 
communication when running on the same computer, on separate CPUs in case of 
SMP architectures. 

There exists a number of software that can be proficiently used to analyze the 
aeroservoelastic behavior of wind-turbines. Some are dedicated to this task, while 
others are general-purpose. A recent survey of some of them is presented in [13]. 
That reference compares software based on accuracy with respect to benchmark 
problems. MBDyn has been coupled to NREL's AeroDyn library in order to exploit 
the availability of a well-proven wind-turbine aerodynamic code [14]. However, 
there are other factors that may come into play, significantly those related to soft- 
ware accessibility, to the capability of modeling problems with an arbitrary level of 
detail, and to fulfill control design requirements. 

The problem of accessibility is addressed by using 'free software'. Problems 
can be analyzed with an arbitrary level of detail when general-purpose software 
is used. Control design requirements, and significantly the capability to perform 
hardware-in-the-loop virtual testing, are met by enabling tight real-time scheduling 
and execution of the simulation and control software. The proposed virtual testing 
environment meets all the requirements illustrated above. 
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The use of general-purpose multibody software typically results in solving larger 
problems, especially when the formulation is based on the redundant coordinate 
approach. This may represent a challenge for real-time simulation; for this reason, 
real-time simulations are often approached reducing the problem to a minimal set 
of coordinates. The redundant coordinate approach, however, usually results in very 
sparse problems. When sparsity is efficiently handled by specialized linear algebra 
solvers, as the one proposed in [15], very good performances can be obtained in 
terms of computational time. A detailed comparison of the effects of different lin- 
ear solvers on the overall efficiency of real-time multibody simulation is presented 
in [16]. Complex mechanical systems, including robots and rotorcraft wind-tunnel 
models, can be simulated in real-time with the desired accuracy using general- 
purpose multibody software, with an acceptable trade-off between model detail and 
real-time implementation [17-19]. This is the case of MBDyn [20,21], the free 
multibody software used and adapted for the purpose of this research. 



3 Wind-Turbine Description 

The CART wind-turbine, shown in Fig. 2, is located at the National Wind 
Technology Center (NWTC) of NREL in Colorado. It is used as state of the art 
test-bench for controls research in wind-engineering [22-25]. The main focus is 
currently on testing control strategies to improve the performances and the handling 
of wind-turbines subjected to exceptional operational conditions. It is an upwind 
machine with a nacelle tilt of 3.8 deg and two teetering blades with zero precone. 



Fig. 2 The Controls 
Advanced Research Turbine 
(CART) 
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The rotor diameter and the hub height are respectively 43.3 and 36.6 m. Power 
energy is rated at 600 kW and generator speed through a gearbox with a ratio of 
43.165 is rated at ISOOrpm. The rotor is thus rated at an angular speed ^lated = 
41.7 rpm. 

This model has been selected because it is described in detail in the above 
mentioned publicly accessible documents. Moreover, since it is characterized by 
a two-blade, teeter rotor, it requires less computational effort than more modern, 
three-blade turbines. As illustrated in the following, this choice was conservative, 
as the proposed analysis leaves margin for further increase in model complexity 
without violating the real-time requirements. 



4 Baseline Controller 

The baseline controller is composed by independent electric torque and collective 
pitch algorithms. Both controllers use the rotor angular speed measurement as sole 
input. 

The task of the control-system is to maximize power capture below, and regulate 
a constant speed above, the rated operating point. Currently, no effort is undertaken 
to regulate the high speed generator shaft brake nor the nacelle yaw (which, in the 
real wind turbine, is limited to only ±0.5 deg and simply used for tracking relatively 
small wind changes). 

Generator commands are calculated by means of a piece-wise function. Below 
the cut-in speed of 10 rpm (Region 1), no electric torque is generated, to let the 
wind accelerate the rotor at maximum angular acceleration. The quadratic region 
(Region 2) is designed to keep the tip-speed ratio at the optimal value for maximum 
power. Above 99% of the rated rotor speed I2rated a constant torque of 3524 Nm is 
required. Between 98% and 99% of ^rated the transition is linear, equivalent to a slip 
of 5% (Region 3) [26-28]. Figure 3a shows the piece-wise working function for the 
electric generator (top), and its block-diagram model (bottom). 

The full-span collective blade pitch angle commands are computed by means of a 
PID controller on the error of the rated angular speed, with saturation on the integral 
term to limit wind-up. Special care is taken to avoid working in post-stall regions 
during the initial acceleration phase. This allows to use the same controller for both 
rotor start-up and speed control at the rated speed. 

Figure 4 shows the controlled system in the typical Scicos/Simulink/MATRIXx 
environment. The box labeled 'from MBDyn' represents the output of the multibody 
model that is input in the control system: the basic control system considered in this 
work requires only a measure of the angular velocity of the rotor. The box labeled 
'to MBDyn' represents the inputs to the multibody model: the free-stream wind 
velocity, the torque absorbed by the electric generator and the desired blade pitch. 

The controlled CART model can be run either in batch or real-time mode. In 
the latter case, the real-time scheduling can be delegated to the operating system 
using a standard POSIX interface, or tightly enforced using RTAI. When executing 
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Block-diagram. 

Fig. 3 Electric generator piecewise working function and block-diagram model 



in batch mode, a generic interface between MBDyn and the external control 
software, based on standard UNIX inter-process communication primitives (lo- 
cal and TCP/IP sockets), is used. When executing in hard real-time mode, the 
controller code is automatically generated by any of Scicos, Simulink or MATRIXx 
from the very same model, and run by the RTAI operating system. In this case, 
MBDyn is scheduled in real-time by RTAI as well, to emulate the real wind-turbine. 
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Fig. 4 Controlled system in the typical Scicos/Simulink/MATRLXx environment 



In real time mode, the RTAILab graphical user interface [29] can be used to 
monitor the controller, either locally or remotely, and to tune gains and other system 
parameters on the fly, as shown later in an example. 



5 Multibody Model 

The multibody approach is definitely suited for the analysis of complex multidisci- 
plinary systems where exact mechanism kinematics, nonlinear structural dynamics, 
and arbitrary control-related components need to be simultaneously analyzed [30, 
31]. Thanks to its versatility and availability, the multibody formalism proposed in 
this work is being used in many fields related to aeroservoelasticity. 

The analysis is based on an original formulation, implemented in the free general- 
purpose software MBDyn [4]. It performs the direct time integration of Initial Value 
Problems (IVP) written as a system of first-order Differential-Algebraic Equations 
(DAE), using implicit A/L stable integration algorithms [32]. 



5.7 Unconstrained Dynamics 



The equations of motion of each unconstrained body are written in first order form 
using the Newton-Euler approach. The definitions of momentum, Pi , and momenta 
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moment about the node, yx, , for the /th node are 

ifiiXi + (tii X s,x, = P (la) 

s,x,- X X,: + J,x,ai/ = y,x,- (lb) 

where m, is the mass of the body connected to the /th node, x, is the location of the 
node, o), is the angular velocity of the node, s^x, is the static moment of the body 
referred to the node's location, and J,x, is the inertia tensor of the body referred to 
the node's location. The equilibrium of each node yields 

^i=Y.^i (2a) 

y,x, + X,- X Pi = ^m,x, (2b) 

where all external forces, f,-, and moments, in,x, , acting on the node are considered. 
The external forces and moments can arbitrarily depend on the motion of all nodes 
the /th one is connected to. The equations of motion of all the unconstrained nodes 
can be summarized as 

Mq = p (3a) 

p = f(q,q,p,0 (3b) 

where q e R" summarizes the kinematic variables of the nodes (n corresponds 
to 6 times the number of bodies, n\,), while p e M" summarizes the momentum 
and momenta moments. The function f : R^"+^ h^ M" represents the generic 
configuration-dependent forces acting on the nodes. It includes the contributions 
related to structural deformability. 



5.2 Constrained Dynamics 

The constrained system dynamics are modeled by explicitly adding kinematic 
constraints between the nodes in form of algebraic equations, using Lagrange's 
multipliers formalism. The addition of Wh holonomic and rrinh non-holonomic con- 
straints, respectively expressed by i^ (q, f ) = ; R""*"^ i-^ R'"'' and i/f (q. q, t) = 
0:M2"+i h^ M'""\ results in 

Mq = p (4a) 

P + -/-/V + ^/V = f ('I' ^' P' '^ (4b) 

'^(q,0 = (4c) 

f{q,q,t) = (4d) 

where A e W"^ and ji e W""^ respectively are the multipliers related to the holo- 
nomic and non-holonomic constraints. 
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5.3 Structural Flexibility 

The modeling of structural flexibility is fundamental for accurate aeroelastic 
analysis. However, accuracy may require a considerable number of degrees of 
freedom. As a consequence, accuracy may need to be traded for efficiency, espe- 
cially when real-time simulation is considered. 

Conventional horizontal axis wind turbines are characterized by a slender tower 
and very slender blades. The slenderness of current large size turbines will probably 
increase further as the size grows from 2 -^ 5MW on. 

Historically, structural flexibility has been considered in multibody dynamics us- 
ing lumped components first. Eventually, the need to bring the level of detail of 
finite elements led to combining the arbitrary reference rigid body motion peculiar 
of multibody dynamics with small perturbed deformation given by linear finite ele- 
ments into the so-called floating frame approach [33]. In this case, a Ritz-like linear 
combination of deformation shapes is used to express a deformation with respect 
to a reference frame that undergoes arbitrary motion. This approach, in the case of 
wind turbines, suffers from the fact that an accurate basis consisting in normal vi- 
bration modes may require a significant number of degrees of freedom, since the 
normal modes are considerably influenced by the centrifugal stiffening [34]. This 
may not be an issue for systems rotating at constant angular velocity, but in general 
wind turbines can operate at an arbitrary velocity, and transient analysis capability 
is essential. 

Accurate modeling of structural components can be achieved using finite el- 
ements directly in the multibody model [35]. The behavior of slender structural 
components can be efficiently described by the beam model. In many cases the 
beam model is fairly accurate and at the same time synthetic; thus, it leads to ef- 
ficient models, allowing to meet real-time simulation requirements. In this work, 
an original, geometrically exact, composite-oriented beam formulation based on a 
finite-volume approach is used [36]. The beam model in the multibody analysis 
takes care of the one-dimensional flexibility of slender structural components. In 
order to give accurate results, it requires a correct and accurate characterization of 
the cross-section inertial and structural properties. In the present work, this pre- 
processing step is based on the section characterization procedure first proposed 
by Giavotto et al. [37], that allows to characterize the 6x6 stiffness matrix of a 
generalized Timoshenko beam. A detailed review of different beam section char- 
acterization procedures is presented in [38]. Other formulations, including the one 
proposed in [39], are compared to standard wind turbine blade characterization ap- 
proaches in [40] . 



5.4 Numerical Integration 

The implicit DAE problem of (4) can be written in the generic form 

g(y,y,0 = 0, (5) 
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where y = {q; p; A; /x} summarizes all the variables of (4). Its solution at the generic 
time step t^, using a generic implicit multistep integration scheme, requires to solve 
(5)fory^, with 

yk= ^ ar^k-r +^ X! ^'^k-s- (6) 

r = l,n s=0,n 

The coefficients a^ and bs characterize the numerical integration method; bo j^ for 
implicit schemes. Equation (5) is solved using a Newton-Raphson scheme, namely 

g/ySifk + S/ySyk = -g- (7) 

According to (6), 8yj^ = hboSyj^; as a consequence, (7) yields 

(g/y + hbog/y)Syk = -S- (8) 

The linear problem of (8) needs to be solved iteratively to convergence. The DAE 
nature of the problem implies that either of matrices g,j, g/j, can be structurally 
singular, or both. However, when the problem is well posed, the matrix pencil (g ;^, + 
Ag/j,) is not structurally singular; thus, (8) can be solved [41]. 

MBDyn is mainly used by its developers to model the aeroservoelasticity of ro- 
tary wing aircraft (e.g. [42]). Analysis of wind-turbine systems is carried out by 
some of the independent users [14,43]. 



5.5 CART Wind Turbine Multibody Model 

The multibody model considered in this work consists in: 

• a deformable tower, made of five three-node finite-volume beam elements, 
clamped to the ground at the lower extremity 

• a rigid nacelle, connected to the tower by a yaw hinge, with built-in pitch; the 
yaw degree of freedom in the analysis is restrained by a very stiff spring, since 
no yaw control is considered 

• a rigid low-speed shaft, connected to the nacelle by ideal bearings; the rotational 
inertia includes that of the high-speed shaft, accounting for the low- to high-speed 
shaft gear ratio 

• an ideal generator, consisting in an internal torque applied between the nacelle 
and the low-speed shaft, whose value is computed by the control task 

• a rigid body that models the teetering hub, connected to the low-speed shaft by 
an ideal teeter joint 

• a pair of deformable blades, modeled with 5 three-node finite-volume beam el- 
ements each, including blade element aerodynamics coupled to an induced flow 
model; each blade is hinged to the teetering body by means of a revolute hinge 
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Table 1 Summary of CART model 



Component 



Nodes 



oints Bodies 



Beams 



Aero 



Forces 



DoFs 



Tower 


11 


1 


10 


5 




Nacelle 


1 


2 


1 






Shaft 


1 


1 


1 






Generator 












Teeter 


1 


1 


1 






Blades, 2x 


11 


1 


11 


5 


5 


Total 


36 


7 


35 


15 


10 



138 
17 
17 

17 
138 

465 



Fig. 5 Non-zero coefficients 
of the CART model matrix; 
fill-in is about 3% 
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that allows to impose the pitch angle. Although the pitch of each blade can be 
controlled independently, in this study the same value is applied for simplicity. 

Table 1 summarizes the number of nodes, elements and degrees of freedom of 
each component of the model. It results in a total of 465 equations: 432 related to the 
dynamics of the nodes, and the remaining 33 related to holonomic constraints. The 
typical sparsity pattern of the matrix pencil of (8) is shown in Fig. 5. The number 
of non-zero coefficients is 6351, which implies a fill-in less than 3%. The matrix is 
intrinsically non-symmetric. 

The blade pitch is controlled by simultaneously rotating the blades at the root 
node by an amount that is determined by the controller task. An angular velocity 
sensor measures the low-speed shaft velocity and feeds it into the control task. 

The wind-turbine pictures in Fig. 6 are generated by an enhanced version of the 
free visualization software EasyAnim [44] (the modified version is available in [4]). 

The structural model of the tower and of the blades is likely too refined, con- 
sidering the very low rotational velocity and the bandwidth of interest. Figure 7a 
illustrates the convergence on the frequency of the first 10 modes of the tower 
plus nacelle model as the discretization is refined from 1 to 10 three-node beam 
elements. Figure 7b refers to the entire model at null angular velocity, with the 
blades in the vertical position. Similar trends are shown by all subcomponents. 
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Fig. 6 Graphical representation of the CART model 
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Fig. 7 CART model convergence on the first normal modes (non-rotating) 

Five beam elements for the tower and five for each blade have been used because, 
as shown in Fig. 7, they yield a dynamically well converged model, while allowing 
to meet the real-time execution constraint. 



6 Real-Time Simulation 



The proposed multibody analysis runs in real-time thanks to RTAI support, built-in 
in MBDyn when running on Linux [17,45,46]. 

Popular graphical tools for computer-assisted control system design and fast 
prototyping, with automatic control code generation, like Scicos, Simulink, and 
MATRIXx, have been extended to support the generation of the controller source 
code in the C programming language, using RTAI's primitives for real-time schedul- 
ing and inter-process communication. 
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As a result, two real-time processes, one simulating the physical system and the 
other performing the appropriate control task, are executed on computers running 
the RTAl real-time extension for Linux. 

The RTOS takes care of scheduling both processes with the desired periodic- 
ity. The processes typically communicate a set of measurements from MBDyn to 
the controller, and a set of control inputs from the controller to MBDyn. Inter- 
process communication uses RTAI mailboxes, a primitive that transparently uses 
shared memory when both tasks are executed on the same machine, or the UDP 
real-time support, provided by the NetRPC extension, when tasks are distributed on 
different machines. 

In more sophisticated applications, a real-time instance of MBDyn itself can be 
embedded in the controller. In those cases, it is used to determine the control in- 
puts required by the controlled process, which in turn can be a real or a simulated 
process [47]. 

Two scheduling approaches can be followed. In one case, the processes synchro- 
nize with each other by using RTAI semaphores. One process, usually the controller, 
is scheduled periodically. As soon as it sends the control input to the simulator, the 
simulator is woken up and starts simulating the time step. This approach guarantees 
that the subsequent time step receives the expected control input. In the other case, 
the two processes are independently scheduled periodically. Each process reads in- 
puts and writes outputs according to its schedule. There is no strict guarantee that 
each process receives exactly the expected input. However, the error can be at most 
one sample period, and thus is treated as a disturbance. 

The simulation must behave in a quasi-deterministic manner or, in other words, 
each sample interval needs to be completed within a given number of operations. 
This is not guaranteed when iteratively solving a nonlinear problem. In order to ob- 
tain a quasi-deterministic behavior, RT-MBDyn solves the nonlinear problem within 
up to a fixed number of iterations, using a modified Newton-Raphson scheme that 
consists in assembling and factorizing the matrix only at the first iteration of each 
time step. Errors due to lack of convergence to the desired accuracy can be reason- 
ably assumed to be small after few iterations, thanks to the superlinear convergence 
properties of the modified Newton-Raphson scheme, provided the prediction at each 
time step is close enough to the actual solution. These errors can be treated as distur- 
bances by the control scheme. Sporadic overruns can be accepted as disturbances, 
provided subsequent steps can "catch up" with the controller. 

Figure 8 shows a fairly broad layout of the real-time simulation setup, where 
the simulation and the controller are located on different computers connected by a 
hard real-time network via NetRPC, while multiple supervising stations monitor the 
output of the controller and of the simulation using soft real-time connections, with 
the possibility of optionally modifying the controller's parameters. 

Figure 9 shows the result in terms of rotor angular speed and pitch command of 
a simulation in correspondence of a growing wind speed rated at an average level of 
12 m/s. In detail, Fig. 9a refers to a random disturbance of 20% of the wind velocity 
magnitude, while Fig. 9b refers to a sinusoidal disturbance whose amplitude is 5% 
the wind velocity magnitude. In both cases, the resulting error on the final rated 
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Fig. 8 Sketch of a generic distributed real-time simulation layout 




Random gust. Sinusoidal gust. 

Fig. 9 Rotor angular speed and pitch command for wind-up and gusty wind 



angular speed is less than 0.5%. The sample rate is 100 Hz. Figure 10 shows the 
output of the controlled CART model within the RTAILab environment. The field 
labeled as 'Gain' in the top left portion of the control panel allows to change the 
controller's parameters while the simulation is running in real-time. The control 
panel can be configured to allow access to any of the control parameters that are 
exported when the control system is designed. 

The numerical simulations have been performed on a Dual Core AMD Opteron 
Processor 280 (1 GHz). In all cases the multibody model could be executed well 
within the required sample rate of 100 Hz. This leaves room for further model 
refinement, e.g. an increase of the sampling rate, or the analysis of more complex 
turbines, e.g. three-bladed. 
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Fig. 10 Output of the controlled CART model within the RTAILab environment 



One of the distinguishing features of the proposed approach is that information 
related to distributed structural flexibility can be simulated and monitored in real- 
time. This paves the way to simulating in real-time the control of strains, stresses, 
and gust load reduction in general. 



7 Conclusions 



This work illustrates the implementation of what can essentially be considered a 
test bench to prove the feasibility of innovative, efficient and low-cost solutions for 
fast-prototyping and customization of controlled mechanical and aeroservoelastic 
systems. The proposed environment has been applied to the development and testing 
of a simple controller for wind-turbines. Further details can be added, both in the 
simulated physical process, to enhance system modeling with features that have 
not been considered so far in this work, and in the controller, to investigate more 
sophisticated control strategies. 
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8 Additional Material 

The analysis was performed using RTAI 3.6.1, available for download at [3], and 
MBDyn 1.3.3, available for download at [4]. The wind-turbine models and the con- 
troller source code are available at [4], in the RT-MBDyn^-wind turbine folder. 
Feedback using the mailing lists rtai@rtai.org and mbdyn-users@mbdyn.org is ap- 
preciated. 

Acknowledgements This work has been partially funded by 'SI PARTE!', a R&D project of 
'Regione Lombardia' that addressed the development of innovative solutions for Embedded Real- 
Time Applications. 
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Comparison of Planar Structural Elements 

for Multibody Systems with Large Deformations 



Markus Dibold and Johannes Gerstmayr 



Abstract In the field of multibody dynamics, structural components, such as beams 
or plates, have been discretized in different ways, according to special requirements 
of certain problem configurations. In literature, models, which follow the same me- 
chanical theories but a different numerical discretization technique, such as the 
absolute nodal coordinate formulation (ANCF) and the floating frame of reference 
formulation (FFRF), have been calculated for comparison. In existing examples, the 
solutions of these calculations do not always coincide very accurately. Therefore, 
in the present contribution, which is an extension of a former work of the authors, 
standard static and dynamic problems in the large deformation regime are treated. 
Special emphasis is laid on converged solutions, using an analytical reference value 
in the static case. For dynamic examples a reference value based on the strain energy 
is provided, in order to simplify the comparison of the different formulations and to 
provide a reference value, similar to the static case, for future studies. For both for- 
mulations planar finite elements based on the Bernoulli-Euler theory are utilized. 
In case of the ANCF the finite element consists of two position and two slope co- 
ordinates in each node only. In the FFRF beam finite element, as usual, two sets of 
coordinates are used to describe the actual configuration. The first set of coordinates 
defines the location and orientation of the body reference frame. The second set of 
coordinates describes small superimposed transverse and axial deflections relative 
to the body frame. The transverse deflections are approximated by means of two 
static modes for the rotation at the boundary and a user-defined number of eigen- 
modes of the clamped-clamped beam. The axial deflection is represented by a linear 
approach. In numerical studies, the accuracy of the two formulations is compared 
for two example problems, a cantilever beam with a singular force at the free end 
and a slider-crank mechanism. It turns out that both formulations have comparable 
performance and that the results coincide in the converged case. 
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1 Introduction 

For the numerical simulation of beam-type flexible multibody systems several 
methods have been established. Traditionally, these simulations are performed with 
a floating frame of reference formulation (FFRF). In this formulation the position or 
displacement of a point of a body is decoupled into a small (flexible) deformation, 
which is described in a local body-fixed coordinate system, and a possibly large su- 
perimposed rigid body motion. The relative displacements are described by means 
of a Ritz approximation, where the space-wise discretization is performed with 
eigenmodes or polynomials, see [1,2]. The use of relative displacement kinemat- 
ics consequently leads to a non-constant mass matrix and centrifugal and Coriolis 
terms, represented by the quadratic velocity vector. The virtual work of internal 
forces leads to a constant stiffness matrix in the case of small deformations. The 
numerical solution of these problems can be obtained using well-known numerical 
integration techniques, [3]. If geometric or material nonlinearities occur, the stiff- 
ness matrix becomes non-constant. The computational efficiency can be improved 
for systems with non-constant mass matrices using implicit time integration meth- 
ods and simplified Jacobians [4]. 

For large deformation problems a pioneering paper was written by Simo and 
Vu-Quoc [5], where a fixed frame of reference was used, and the so-called geomet- 
rically exact formulation of the strain energy of the beam finite element was based 
on classical nonlinear rod theories. Neglecting shear deformation, this theory coin- 
cides with the strain definitions in the Euler elastica [6]. The numerical treatment 
of geometrically exact beam elements is based on discretization of the position of 
the beam centerline and rotation of the cross section. The strain energy can be in- 
terpreted with the help of a co-rotational formulation of the strain. Alternatively, in 
the absolute nodal coordinate formulation (ANCF), which is a large deformation 
finite element formulation, the deformation of the beam is interpolated on basis of 
position and slope degrees of freedom in the nodes. The use of slopes instead of 
rotational parameters still allows the exact representation of rotational inertia of a 
rigid body [7]. Originally, fully parameterized ANCF finite elements showed var- 
ious types of locking, such as Poisson locking or combined shear and thickness 
locking. These effects could be resolved, and the solution of ANCF finite elements 
now leads to good agreement with the solution of fully three-dimensional compu- 
tations with solid finite elements, see [8]. An important advantage of the absolute 
nodal coordinate formulation is the constant mass matrix. The elastic forces are non- 
linear and usually derived from the deformation field taking into account the fully 
nonlinear Green strain tensor and the second Piola-Kirchhoff stress tensor [9]. As 
an alternative to fully parameterized ANCF elements, simplified ANCF elements 
have been developed, which only take into account the deformation of the elastic 
line of the beam element. This formulation is computationally more efficient, but 
less general. When restricting e.g. to the Bernoulli-Euler case, only nodal positions 
and axial slopes are utilized as degrees of freedom. 

The present investigation is an extension of a former work of the authors [10], 
where nonlinear beam elements based on the fundamentally different approaches of 
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the FFRP and ANCF are compared for example problems in the large deformation 
regime. Both types of elements include axial deformation and bending. In case of 
the ANCF the deformations are intrinsic coupled, whereas in the FFRF they are un- 
coupled. To oppose the different formulations, a brief introduction to each of them 
is given. A comparison of plane static and dynamic example problems of the cor- 
responding beam finite elements is performed and characteristic results are shown. 
The convergence of the results is studied with respect to significant values, in or- 
der to provide reference values for future studies. It is shown that the results of the 
FFRF and the ANCF coincide in the converged case, although the studied methods 
have a completely different numerical discretization. 



2 Floating Frame of Reference Formulation 

In the floating frame of reference formulation two sets of coordinates are used to 
describe the configuration of a flexible body [11]. The first set of coordinates defines 
the origin and the orientation of a reference frame with respect to the inertial frame. 
In the present work the reference frame is fixed to the flexible body such that both 
ends of the beam are located at the x-axis of the body frame. The body reference 
represents (possibly large) rigid body translation and rotation. The second set of 
coordinates describes small elastic deformations relative to the reference frame. 

In the present contribution we restrict ourselves to the plane case, where each 
structure is represented by a specific number of beam finite elements. In the follow- 
ing, general assumptions are made for one finite element. The configuration of this 
element is shown in Fig. 1 . 

The origin of the reference (x, y) coordinate system is given by the position 
vector To with respect to the inertial (X, Y) coordinate system. The orientation is 



Fig. 1 FFRF model of a 
finite element 




actual configuration 



undeformed reference 
configuration 
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determined by (p, the corresponding angle about the axis of rotation. It is notewor- 
thy that the center of mass in the undeformed beam element is at the origin of the 
reference frame, which leads to simpler expressions in the kinetic energy. The fol- 
lowing derivation is provided according to the widely known formulation as given 
in the monograph of Shabana [11]. Additionally, special emphasis is laid on the 
formulation of the specific element. 

The elastic deformations of the finite element are modeled by means of the 
Bernoulli-Euler beam theory, including bending and axial deformation. The posi- 
tion vector of a point P at the deformed axis of the finite element is defined according 
to Fig. 1 , 

r^ = ro -F A u (1) 

wherero = [rox roy]^. InEq.(l)urepresents the position ofpointPin the floating 
reference frame, given by 



Uo +U/ 



+ 



u{x, t) 
w{x, t)_ 



(2) 



including the axial coordinate x e [— L/2..L/2] and the axial and transverse de- 
formation m(x, t) and w(x, t). The position vector u is transformed into the global 
coordinate system by means of the rotation matrix A, which only depends on the 
rigid body angle (p. The velocity vector is computed from Eq. (1) 

Tp = To + A u -I- A u (3) 

where () = d/dt denotes the derivative with respect to time. The space- wise 
discretization is performed with a Ritz approach. This leads to the following ap- 
proximations of the deflections 

u{x, t) = Y!".^^ si ix)q'^ (t) and w(x, t) = J^'^^, ^l ix)q'^+''+\t) (4) 

with shape functions S/(x) and S^^,(x) and time-dependent elastic coordinates 
q''f{t). In the following, a matrix notation for u/ will be derived, in order to get 
a readable formulation of Eq. (2), when inserting the approximations of the defor- 
mations. Therefore, the shape functions S^ {x) and Sl {x) are collected in two row 
vectors Si(x) and SiCx) of length (A'^ -I- M), such that 



Si(x) = [si ... 5,f ... q\ = [sI ... Sf'+^J 
S2(x) = [o ... Sl ... S,f] = [si ... 5^^+^] 



(5) 



In Eq. (5) it is assumed that the shape functions to describe axial and transversal 
deflections are not intrinsic coupled. The time dependant elastic coordinates q^'r(t) 
consequently are collected in a column vector of the same length ( A^ -I- M ) . Inserting 
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Eq. (5) into Eq. (4) and the resulting formulations into Eq. (2) leads to a matrix 
notation for u/, where 



u/ 



u(x, t) 
w(x, t) 



'Si(x) 



q/(f)=S(x)q/(0, 



(6) 



Since dynamics in axial direction include significantly higher frequencies than in 
transverse direction, the shape functions Sy,k < A^, are derived from a linear ap- 
proximation of the axial deformation, 



Sl(x) = ^0 + ■^1-^' 



(7) 



which leads to A^ = 1 . This linear approximation of the axial deformation still gives 
satisfying results for the calculation of the axial deformation. Inserting the boundary 
conditions 



>i(-^ = -y)=0and5i^x = |^ 



= 1 



into Eq. (8) leads to the corresponding shape function 

, 1 X 
1 2 L 



(8) 



(9) 



The shape functions to describe the transversal deflection are approximated accord- 
ing to the reduction methods, see [12]. In this field, structural elements are described 
by means of the rotation at the boundary and a finite number of shape functions, 
which usually are found from the corresponding frequency equation. In the present 
case the static modes for the rotation at the boundary are approximated by a cubic 
polynomial, which leads to 



5,^+1 



X x^ + ^rx^ andS^"*"^ = x-\ x^ + — 



2L 



2L 



■x\ (10) 



The shape functions are given by a user-defined number of eigenmodes, s, of the 
clamped-clamped beam. These eigenmodes, S^, / = l..s, follow from the partial 
differential equations of the beam finite element with 



Siix) = (cosh(^) - cos(^)) - a (sinh(^) - sin(^)) , 
were the following parameters have to be taken into account 



cosh(L/A') - cos(L/A') ^ (x + L/2) 
a = , . . . , , : — , . . . , , . ^ = — and 



sinh(L/A')-sin(L/A') 



A' 



(11) 



(12) 
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Fig. 2 (a) Static modes and (b) first two eigenmodes of tlie FFRF finite element 



A' , which has to be solved from 

cosh(L/A')cos(L/A') = 1. 



(13) 



The user-defined eigenmodes are finally included into the row vector of shape func- 
tions S2 such that 

5f +2+' = S\, (14) 

and M = 2 + s. The static modes and the first two eigenmodes to approximate 
transversal deflections are shown in Fig. 2 respectively. 

Due to the restriction to plane problems, the rotation matrix depends only on the 
rigid body angle cp. Therefore, the expression Au of Eq. (3) can be rewritten by 



Au = ijoA^u = (pB, 
with the abbreviations A^ = dA/d(p and 



B 



— smcp — cos (p 
cos (p — sin I/) 



X + u{x, t) 
w{x,t) 



(15) 



(16) 



such that a matrix notation of Eq. (3) can be found by 



rp = [I B AS] 



ro 
LQ/J 



Lq. 



(17) 



In Eq. (17) I describes a 2 x 2 identity matrix, q is the vector of generalized coordi- 
nates of one finite element and L = [I B AS]. 

The equations of motion are usually derived from Lagrange's equations. 



d_(dT\' _ 



^ (dTY fscV 



uj nsfj ^=« 



(18) 
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where the kinetic energy T , internal and external forces Q = Q/ + Qe, as well 
as possible Lagrange multipliers X due to constraint equations C = need to be 
considered. The kinetic energy for the beam finite element is given by 



-pA 



L/2 



dx 



(19) 



-L/2 



where p is the mass density and A is the cross-section of the beam element. By 
means of Eq. (17), the kinetic energy reads 



/ 



r = -q^ 



pA 



V 



L/2 



\ 



Lfdx 



-L/2 



/ 



q=2'i 



Mq 



(20) 



including the mass matrix M of a beam element. The single components of the mass 
matrix can be found by means of the shape functions and the generalized coordinates 
of the FFRF beam finite element, where all constant terms are pre-computed during 
the initialization of the finite element. For an extensive derivation of the single parts 
of the mass matrix see [13]. Evaluation of terms of Lagrange's equations, which 
contain kinetic energy leads to 



d_ 
dt 






Mq + Mq 



1 



dq 



(q^Mq) 



= Mq-Q,. (21) 



In Eq. (21) the so called quadratic velocity vector Qv, see Shabana et al. [7], is 



given as 



Q. = -Mq + 



1 



_9_ 

aq 



(q^Mq) 



The virtual work of internal forces in its general form reads, 

SU = f o^SedV 

V 



(22) 



(23) 



where 0, is a vector gathering the six components of the stress tensor, and £ is the 
corresponding vector of strain components. In case of a BernouUi-Euler beam only 
the axial strain component £xx is taken into account. According to [14], the axial 
strain can be approximated by 



e.x = u'-yw" = (S; - yS'^) q/ = S q/ 



(24) 



in which ( )' = d/dx and ( )" = d^/dx'^ denote differentiations with respect to the 
local coordinate x and the row vector S = (S\ — y Sj)- Including Young's modulus 
E, the strain-stress relation is 



<yxx = E Ex 



(25) 
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Inserting Eqs. (24) and (25) into Eq. (23), the variation of the strain energy can be 
written 



(/ 



SU =q^\ / S^SJF Sqf =q^K//3q/, 



(26) 



in which the stiffness matrix K^ associated with the elastic coordinates of the finite 
element can be identified. Rewriting Eq. (26) by means of the element coordinates 
leads to _ 

"0 " 



3C/ = q^ 




Kff 



8q = q^KS q = Q/ Sq 



(27) 



in which K is the stiffness matrix of one beam element. 

The virtual work of the external forces, SWe = Qj Sq, leads to the vector of gen- 
eralized forces Qe- In order to include boundary conditions for a finite element or to 
connect finite elements, constraint equations are taken into account. These equations 
are conventionally included, utilizing a system of nonlinear algebraic equations, 
which depends on the generalized coordinates. 



C (q. = 



(28) 



in which C is the vector of linearly independent constraint functions. This leads to 
an extension of the overall system of equations according to the number of finite 
elements. These additional equations could be avoided e.g. by a coordinate trans- 
formation to minimal coordinates. This would lead to a set of equations, where 
the degrees of freedom, which are relevant for e.g. robotics or control design, 
are directly available. However, the derivations to get this set of equations have 
to be performed symbolically for most of the applications. Furthermore, in the 
present contribution a sparse solver is used within the numerical time integration 
and therefore no significant computational costs result from the additional constraint 
equations. On the other hand, the present method provides a more general formula- 
tion of the overall system. 

As an example for constraint equations, the functions Ci to constrain the posi- 
tions, and the function C2 to constrain the angles of the elements i and j are shown 
in Eq. (29), 



(c:)= 



/ 


r'(x' 


= L',t)-r'(xJ =0J) 


\ 


/■ 


3w'(x',0 
dx' 


: dwHxKt) 


^'^% 



(29) 



Together with the constraint equations, see Eq. (28), the Lagrange equations, see 
Eq. (18), follow with 

Mq + Kq + C[X = Q+Q, 
C(q, f ) = 

In Eq. (30) Cq denotes the Jacobian of the constraint equations. 



(30) 



Comparison of Planar Elements with Large Deformations 

3 Absolute Nodal Coordinate Formulation 



95 



In the following, the planar ANCF element as derived by Berzeri and Shabana [15] 
is shortly revised. Note that symbols are introduced in this section which previously 
had a different meaning. The ANCF element has only eight degrees of freedom 
(DOF), four in each of the two nodes. Figure 3 shows the undeformed and the de- 
formed configuration of the ANCF element. The vector r denotes the actual position 
of a point on the deformed beam, which originally was located at the position x on 
the undeformed beam element. The nodal coordinates consist of two position DOF 
and two DOF for the directional derivative with respect to the x coordinate, also de- 
noted as x-slope. The derivatives are abbreviated as r' = dr/dx and r" = S^r/Sx'^. 

The advantage of this formulation compared to fully parameterized elements 
[16], which use all components of the gradient at a node, is the good convergence 
and small computational costs. The neglection of shear deformation is applicable in 
very thin and cable-like structures, which often appear in multibody systems. 

The nodal coordinates at x = and x = L of an element are defined as 



qv = [rj r-]' 



and the element coordinates are 

q = [qf ql] 

The actual position of the point r is given by 



(31) 



(32) 



r = 



bo + bix + bzx^ + bjx^ 



(33) 



y^ 



Fig. 3 ANCF model of a 
finite element 



deformed configuration 




undeformed reference configuration 



X 
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Note that this displacement field is a cubic polynomial in x. The coefficients a, and 
bi are chosen such that the displacement field fulfills the requirements of Eqs. (31) 
and (32) at the nodes, at position x = and x = L. This leads to 

r=[SiI 52l S3I S4l]q = Sq. (34) 

The shape functions are given by 






(35) 



with the normalization ^ = 2x/ L — l.A comparison of the shape functions of the 
ANCF element of Eq. (35) and those of the FFRF element of Eqs. (10) and (1 1) 
shows that within the ANCF element for both, the axial and the transverse deflec- 
tion the same polynomial is used. This leads to a total number of eight degrees 
of freedom, three of which take into account rigid body translations. Two shape 
functions describe the transverse deflection including the rotation at the boundary. 
The remaining three shape functions consequently take into account axial defor- 
mations, which therefore are approximated with cubic order. In comparison to the 
FFRF element, which takes into account axial deformation with a linear order of 
approximation, the equations of the ANCF element become much stiffer. However, 
within the FFRF element the transverse deflections are approximated with s + 2 
shape functions, where s e [1. . .4], such that the system of equations of the FFRF 
element becomes larger. 

The mass matrix can be computed via integration over the x-axis, rewriting the 
shape functions with respect to the x-coordinate. 



= AfpS^ 



M = A pS'Sdx (36) 



in which A denotes the cross-section area, L is the length and p is the density of 
the beam element. The mass matrix is integrated exactly by means of Gaussian 
quadrature. The computation of the elastic forces, which result from the virtual work 
of internal forces, is split into a part due to bending, based on the curvature of the 
deformed beam axis and a part due to axial strain. Usually in the ANCF, the axial 
strain is computed using the x-component of the Green strain tensor, 

e°, = i(r'^r'-l) (37) 
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Due to compatibility reasons with classical finite beam elements, which are based 
on the geometrically exact solution of the beam centreline, the so called extensible 
elastica, or the shear deformable element provided by Simo and Vu-Quoc, in the 
present case a strain based on the axial stretch, 

eo=|r'|-l, (38) 

is used, compare e.g. the work of Dmitrochenko and Pogorelov [17]. 

The curvature of the beam centerline, which can be related to the curvature de- 
fined in the Frenet-frame, follows by 



r' X r" 
K = ' ^ (39) 



where k is the absolute value of the curvature depending on a generalized parame- 
ter X. Using the Bernoulli-Euler beam theory, the virtual work of the elastic forces 
including axial deformation can be written as 



1-, 
W = - j (EIK^ + EAsl) dx 



(40) 



where K is the rate of rotation of the cross section with respect to the undeformed 
beam length, sometimes denoted as the "material form of curvature", compare 
Reissner [18] and Gerstmayr and Matikainen [8]. This parameter is related to the 
rotation of the cross section cp, respectively the spatial form of curvature k by 

^ = / = ^ (41) 

ax \r'\ 

and leads to the exact solution for large deformation problems according to extensi- 
ble elastica theories. In the implementation, it is possible to derive the integrand of 
Eq. (40) analytically in order to minimize the number of mathematical operations. 
The non-rational expressions of the integral in Eq. (40) can be well approximated by 
means of Gaussian integration. It turned out that five integration points for the axial 
strain components and three integration points for the curvature (bending) compo- 
nents are sufficient for accurate approximations of the elastic forces. The elastic 
forces can be derived from Eq. (40) by means of differentiation, 

L 



K = ^= f (eIK^ + EAso^dx I (42) 



9q y V 3q ' " 9q 





The equations of motion are gained from D' Alembert's principle and are written for 
one element 

Mq + Fei(q) -C\\ = F^xt (43) 
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in which external forces Fext and the Jacobian C,^ of the vector of constraints C are 
considered. Compared to the equations of motion of one finite element described by 
the FFR formulation in Eq. (30) of the previous section, a constant mass-matrix is 
derived and centrifugal and Coriolis terms do not come into the play. However, the 
vector of elastic forces is nonlinear in this case, which leads to a nonlinear stiffness 
matrix. 



4 Numerical Examples 

In the following, large deformation static and dynamic example problems are pre- 
sented. The investigated mechanical systems are modeled with ANCF and FFRF 
beam finite elements for comparison. In case of static computations, the conver- 
gence rate is investigated and the results are compared to an analytical solution of 
the extensible Euler elastica, which is provided according to the work of Gerstmayr 
and Irschik [9]. 

In case of dynamic computations, a reference value based on the strain energy of 
the finite elements is provided. This reference value allows a simplified comparison 
of different formulations and can act as reference solution for future studies. 

All computations are performed within the multibody code HOTINT, where ac- 
curacy parameters and criteria for the convergence of the numerical differentiation 
were set, such that highest accuracy could be reached for the investigation of the 
convergence. 

Characteristic solutions of both, static and dynamic examples, are shown at the 
end of each section. 



4.1 Static Example Problem 

In the following static example problem, a cantilever beam with a tip load is im- 
posed. The length of the beam is chosen to be L = 2m , the rectangular cross-section 
is given hy h = b = O.Olm and Young's Modulus is £ = 2.07 x lO^^N/m^. The 
tip load is chosen Fq = l.5EI/L^, such that large deformations occur, see Fig. 4. 

The example problem is modeled by means of ANCF and FFRF beam finite ele- 
ments. For comparison, the tip displacement is calculated with an increasing number 
of finite elements. The results are reported for a comparison of convergence of the 
method itself. When convergence is reached the solutions are compared additionally 
with the analytical solution of the extensible Euler elastica as proposed in the work 
of Gerstmayr and Irschik [9]. Following this contribution, the differential equations 
of the cantilever beam as shown in Fig. 4 are derived from the local equilibrium 
equations of a stretched beam element 



Comparison of Planar Elements with Large Deformations 



99 



Fig. 4 Cantilever beam 




^^JS^=-(l+X(x))^cos(e(x)). 



(44) 



where 9 denotes the actual rotation of the beam axis with respect to the undeformed 
beam coordinate. EI includes the bending stiffness and A the axial elongation 
given as 

^0 



X{x) = ^sm{eix)). 
The boundary conditions for the cantilever beam read 



e(x) U=o = and 



de{x) 



dx 



Fq If 
EI 



(45) 



(46) 



meaning that the rotation at the support is zero and the bending moment at the 
support is M = FqIf- The horizontal distance Ip between the force Fq and the 
support in deformed configuration, as shown in Fig. 4, depends on the unknown 
solution for the rotation, but needs to be taken into account for large deformation 
analysis. The distance can be computed by 



/ 



Ip = [{X + l)cos(6l(x)) + 1] dx 



(47) 



and needs to be solved together with Eq. (44). The nonlinear deformation problem 
can be solved for arbitrary precision in symbolic computer programs by means of 
an iterative solution of Ip . Initially, the distance of the force Fq is assumed to be 
Ip = L, and in the ith iteration, the solution of the last iteration 9^'~^\s) is used to 
compute the approximated distance of the applied force, which converges well for 
the investigated problems. To calculate the reference solution, the method has been 
implemented in the symbolic computer code MAPLE, which in general provides 
arbitrary accuracy. For the following example, the tolerance of the Newton method 
has been set to 10~^^, while 22 digits have been used internally for the computation 
of the single expressions in MAPLE. 
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ANCF 



FFRF 



No. elements 


Mx(m) 


uyim) 


ux(m) 


uylm) 


64 


0.2158812656065 


0.8219585119628 


0.2158740 


0.8219675 


128 


0.2158812676722 


0.8219585172209 


0.2158794 


0.8219607 


256 


0.2158812678120 


0.8219585174868 


0.2158808 


0.8219590 


512 


0.2158812678139 


0.8219585174889 


0.2158811 


0.8219586 


Extensible elastica 


0.2158812678134 


0.8219585174888 


0.2158812 


0.8219585 



Table 2 Displacements in 
the cantilever beam 



Elements 



Ux(m) 



ur(m) 



4096 FFRF 
Elastica 



0.2158812660572 
0.2158812678134 



0.8219585119628 
0.8219585174888 



The distances M;f andwy of the endpoint with respect to the(X—, F— )coordinate 
system are shown for an increasing number of beam finite elements in Table 1. 
There, a faster convergence of the ANCF beam finite element, which coincides up to 
12 digits with the solution of the elastica when taking into account 512 beam finite 
elements, is shown. 

The results of the FFRF model show a comparable tendency towards the same 
reference value, given by the solution of the elastica. Therefore an additional calcu- 
lation taking into account 4096 FFRF beam finite elements is performed. The results 
of this calculation are shown in Table 2, where a significant increase of coinciding 
digits is documented. 

The convergence of both formulations is shown in Figs. 5 and 6 by means of the 
relative errors of the X- and 7 -displacements, given for a quantity u and a corre- 
sponding reference value Uref by 



'■iref I 



(48) 



'^ref\ 



where the corresponding solution of the elastica is used as a reference value. On the 
logarithmic scale, the order of convergence for the FFRF elements is approximately 
O(n^), for both the X- and the F -displacements. The ANCF elements converge 
approximately with 0{n^) for a lower number of finite elements, while the conver- 
gence rises significantly for a larger number of finite elements to nearly 0(ii^). 



4.2 Dynamic Example Problems 



In the following section a slider crank mechanism is treated exemplarily for com- 
parison of the ANCF and the FFRF within a dynamic example. The mechanism 
consists of a rigid crank of length r = 0. 1 m, which is driven by a constant circular 
velocity a) = 25rad/s. The flexible slider has length L = 0.5 m, cross-sectional 
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Fig. 5 Relative error of the tip displacements in X -direction versus the number of elements 
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Fig. 6 Relative error of the tip displacements in F-direction versus the number of elements 



area A = hb = 0.004 x 0.004 m^ and gravity g = 9.81 m/s^. In order to provide 
a highly flexible system, Young's modulus is reduced to £ = 2.07 x lO^^N/m'^. 
In case of the FFRF model s = 2 eigenmodes are chosen. The connection rod is 
constrained to the prescribed motion of the crank at the left end, and cannot move in 
vertical direction at the right end, compare Fig. 7. The slider has an initially unde- 
formed straight position and is aligned in the global X-direction together with the 
rigid crank. The initial velocity of the slider is zero. 

The prescribed velocity at the hinge, therefore leads to high (but limited) accel- 
erations, which induce large oscillations. The Z-position of the right end and the 
F-position of the midpoint of the slider are presented in Figs. 8 and 9, respectively. 
The comparison of eight and 64 finite elements shows already a good overall con- 
vergence of the models. 
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sWWW 
Fig. 7 Flexible slider-crank mechanism, deformed configuration 
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Fig. 8 Time evolution of the X-position of the right end of the slider 
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Fig. 9 Time evolution of the Y-position of the mid-point of the slider 
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In order to simplify the comparison of convergence of dynamic examples, one 
characteristic reference value, comparable to the endpoint deflection of a static 
cantilever beam, is provided in the following. This value is supposed to represent 
elastic effects of the studied system. Therefore, the strain energy of the beam finite 
elements. 



U = - I (a^e) dV = -\eA j {u'f dx + EI j {w"f dx , 



(49) 



is used as a basis. In order not to eliminate any shares of the strain energy, due to 

alternating amplitudes, the absolute value is taken into account. Finally, to gain the 

reference value R, the arithmetic mean value of the studied time period t is imposed 

such that 

1 r 

\U\dt. 



R = 



II 



(50) 



The reference values are shown for both, an ANCF and a FFRF model of the slider- 
crank mechanism in Table 3, where a tendency towards a converged solution can be 
assumed. 

This convergence tendency of the problem example with respect to the method 
itself, consequently is documented by means of the error value 



R-R 



errR 



ref\ 



Rref 



(51) 



where Rref is the value of a reference solution given by 256 beam finite elements 
in the present case. The corresponding convergence behavior of the slider-crank 
mechanism is shown in Fig. 10. Again, the ANCF model shows a better convergence 
behavior, but both formulations lead to the same results in the converged case. 



Table 3 Reference value of ANCF and FFRF slider-crank model 



No. elements 


Sancf 


1 


0.020907904853330057 


2 


0.026523065528398459 


4 


0.030813768896922035 


8 


0.033293322837706731 


16 


0.034418309164295127 


32 


0.034880005393381937 


64 


0.035015858976338687 


128 


0.035051064338982495 


256 


0.035060163996004430 



Ra 



0.053579719281086631 
0.053494855207199613 
0.032150639794887304 
0.033255123185251721 
0.033545293772669946 
0.034377494644696992 
0.034875805279761446 
0.035016179848660837 
0.035052237300640429 
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Fig. 10 Error value with 
respect to strain energy; 
slider-crank mechanism 
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5 Conclusions 

In the present contribution, the fundamental different approaches ANCF and FFRF 
for modeling and simulation of multibody systems including large deformations 
have been compared. Although the methods use a completely different discretiza- 
tion, they lead all to the same results in the converged case. In the static case a 
cantilever beam with a tip load was treated, where the endpoint deflection was used 
as a reference value to study the convergence behavior of each method with respect 
to the analytical solution of the extensible Euler elastica. A slider-crank mechanism 
was used exemplarily to study a dynamic example. In order to enable a simpli- 
fied comparison of the results of dynamic examples, a reference value based on the 
strain energy of beam finite elements was introduced. The proposed entity may act 
as a reference value in future studies. 

It is noteworthy that in both examples the ANCF model showed better conver- 
gence properties. 
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Modeling and Analysis of Rigid Multibody 
Systems with Translational Clearance Joints 
Based on the Nonsmooth Dynamics Approach 

Paulo Flores, Remco Leine, and Christoph Glocker 



Abstract The main purpose of this paper is to discuss a method for a dynamic 
modeUng and analysis of multibody systems with translational clearance joints. The 
method is based on the nonsmooth dynamics approach, in which the interaction of 
the elements that constitute a translational clearance joint is modeled with multiple 
frictional unilateral constraints. In the following, the most fundamental issues of the 
nonsmooth dynamics theory are revised. The dynamics of rigid multibody systems 
are stated as an equality of measures, which are formulated at the velocity-impulse 
level. The equations of motion are complemented with constitutive laws for the nor- 
mal and tangential directions. In this work, the unilateral constraints are described 
by a set-valued force law of the type of Signorini's condition, while the frictional 
contacts are characterized by a set-valued force law of the type of Coulomb's law 
for dry friction. The resulting contact-impact problem is formulated and solved as a 
linear complementarity problem, which is embedded in the Moreau's time-stepping 
method. Finally, the classical slider-crank mechanism is considered as a demonstra- 
tive application example and numerical results are presented. The obtained results 
show that the existence of clearance joints in the modeling of multibody systems 
influences their dynamics response. 



1 Introduction 

Manufacturing tolerances, wear and material deformation lead to imperfect joints 
and, therefore, clearances. These clearances modify the dynamic response of 
the system, justify the deviations between the numerical predictions and the 
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experimental measurements and eventually lead to important deviations between 
the projected behavior of the mechanisms and their real outcome. The presence 
of clearance in joints is a complex and important issue in the realistic modeling 
of multibody systems. This aspect gains paramount importance due to the demand 
for the proper design of the real joints in many industrial applications. Over the 
last few years, extensive work has been done to study the dynamic effect of the 
revolute joints with clearance in multibody systems. However, translational joints 
with clearance have received less attention [1-4]. 

Indeed, a number of theoretical and experimental works devoted to the research 
on multibody mechanical systems with realistic joints has been published recently. 
However, most of these works focus on revolute joints with and without lubrication 
effects. An extensive literature review on the issue of modeling and simulation of 
multibody systems with revolute and spherical clearance joints can be found in the 
work by Flores et al. [2]. In contrast to the revolute and spherical clearance joints, 
not much work has been done to model translational joints with clearance because 
in this case several different configurations between the joints elements can take 
place. In fact, the contact configurations of slider and guide include: (i) no contact 
between the two elements; (/() one corner of the slider is in contact with the guide 
surface; (Hi) two adjacent slider corners are in contact with the guide surface, which 
corresponds to have a face of the slider in contact with the guide surface; (/v) two 
opposite slider corners are in contact with the guide surface [5-7]. Moreover, each 
contact point may be in stick or in slip phase, which greatly enlarges the number of 
contact configuration. The conditions for switching from one case to another depend 
on the system's dynamic response. 

Farahanchi and Shaw [8] studied the dynamic response of a planar slider-crank 
mechanism with slider clearance. They demonstrated how complex the system's 
response is, which can be chaotic or periodic. More recently, Thiimmel and Funk 
[9] used the complementarity approach to model impact and friction in a slider- 
crank mechanism with both revolute and translational clearance joints. With the 
purpose to analyze the slider crank mechanism, Wilson and Fawcett [10] derived 
the equations of motion for all different possible configurations of the slider motion 
inside the guide, resulting in a total of 40 equations. They also showed how the 
slider motion in a translational clearance joint depends on the geometry, speed and 
mass distribution. 

Therefore, in the present work, the nonsmooth dynamics approach is used to 
model the type of multibody systems, due to its simplicity and ability to deal with 
all possible different configurations in a unified manner. The methodology is based 
on the nonsmooth dynamics approach, in which the interaction of the colliding 
bodies is modeled with multiple frictional unilateral constraints. The dynamics of 
rigid multibody systems are stated as an equality of measures, which are formu- 
lated at the velocity-impulse level. The equations of motion are complemented with 
constitutive laws for the forces and impulses the normal and tangential directions. 
In this work, the unilateral constraints are described by a set-valued force law of 
the type of Signorini's condition, while the frictional contacts are characterized by 
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a set-valued force law of the type of Coulomb's law for dry friction. The resulting 
contact-impact problem is formulated and solved as a linear complementarity 
problem, which is embedded in the Moreau's time-stepping method 



2 Basic Set- Valued Elements 

A linear complementarity problem (LCP) is a set of linear equations that can be 
written as, [11,12] 

y = Ax + b (1) 

subjected to the inequality complementarity conditions 

y > 0, X > 0, y'^x = (2) 

for which the vectors x and y have to be evaluated for given A and b. In other 
words, the LCP is the problem of finding solutions x e R" and y e M." of (1) and 
(2), where b is an « -dimensional constant column and A is a given square matrix of 
dimension n. The inequality complementarity conditions expressed by Eq. (2) are 
often written in the form 

< y-Lx > (3) 

where y_Lx denotes y^x = 0. An LCP can have a unique solution, multiple solutions 
or no solution at all [13, 14]. All existing solutions can be found using enumerative 
methods, which treat the problem by a combinatorial evolution of the complemen- 
tarity condition x, y, = 0. From the complementarity condition it follows that when 
Xi > 0, then j,- = 0, and vice versa. An LCP of dimension n provides 2" dif- 
ferent combinations of n variables, which are allowed to be greater than zero at 
the same time. For large dimensions, enumerative methods become numerically ex- 
pensive since 2" grows rapidly. A more efficient algorithm is the complementarity 
pivot algorithm, usually referred to as Lemke's algorithm [15-17]. Other efficient 
algorithms to solve LCP can be found in the work by Cottle et al. [12]. 

One of the most important multifunctions (or set-valued maps) related to com- 
plementarity is the unilateral primitive, denoted by Upr. The unilateral primitive is 
a maximal monotone set- valued map on M^ defined as [18, 19] 



Upr(x) := i (-00, 0] x = (4) 



{0} 




X 


>0 


■oo, 


0] 


X 


= 







X 


<0 



The graph of the unilateral primitive map is depicted in Fig. la. Thus, each comple- 
mentarity condition of an LCP can be expressed as one Upr inclusion 

-y e Upr(x) <^ y>0, x>0, xy = (5) 



no 
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X Xl 
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Fig. 1 (a) Map x — > Upr(x); (b) Map x — >■ Sgn(.v); (c) decomposition Sgn(A') into Upr(x) 



Unilateral primitives are used in mechanics at the displacement level and at the 
velocity level to model unilateral geometric and kinematic constraints, such as free 
plays with stops, sprag clutches among others. The associated set-valued force laws 
are conveniently stated as inclusions of (5). 

A second maximal monotone set- valued map, frequently used in complementar- 
ity problems, is the filled-in relay function Sgn-multifunction, which is defined by 
[18,19] 

r {+\} x>Q 

Sgn(A;):= \ [-1, +\] x=0 (6) 

[ {-1} xkQ 

It is important to highlight that, while the classical sgn-function is defined with 
sgn(O) = 0, the Sgn-multifunction is set-valued at x = 0. The graph of the Sgn- 
multifunction is shown in Fig. lb. An inclusion in the Sgn-multifunction can always 
be represented by two inclusions involving the unilateral primitive. The decomposi- 
tion can be written as 



—y e Sgn(x) <^ 3 xr, xl s.t. 



-y G -FUpr(xR) + 1 
-y e -Upr(xL) - 1 

X = Xr- XL 



(7) 



Using Eq. (5), the Eq. (7) can be rewritten in terms of complementarities 



1 + J > 0, XR>Q, (1 -h y)xR = 
-y G Sgn(x) «» 3xr, XL^X. \\-y >Q, xl > 0, (\ - y)xL = 

X = Xr- XL 

(8) 

This representation has to be used when a problem involving Sgn-multifunctions is 
formulated as an LCP in its standard form [20]. 
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3 Set- Valued Force Laws for Frictional Unilateral Contacts 

In the present work, the normal contact between rigid bodies is characterized by a 
set-valued force law called Signorini's condition [21]. Figure 2 shows two convex 
rigid bodies apart from each other by a relative normal gap or distance denoted by 
gN ■ This relative normal gap is uniquely defined for convex surfaces, being per- 
pendicular to the tangent planes at the contact points 1 and 2. The relative normal 
gap is non-negative due to impenetrability condition of the bodies. The two bodies 
in contact with each other when gN = 0. In fact, one of the main features of unilat- 
eral contact is the impenetrability condition, which means that the candidate bodies 
for contact must not cross the boundaries of antagonist bodies. On the other hand, 
the normal contact force Xn is also non-negative because the bodies can not attract 
each other, that is, the constraint is unilateral. The normal contact force vanishes 
when there is no contact, i.e., g^/ > 0, and can only be positive when contact hap- 
pens, that is, gN = 0. Thus, under the assumption of impenetrability between the 
bodies, expressed by g^ > 0, only two situations can occur, namely. 



gN = A Xn > 0, (closed contact) 
gN > A X;^ = 0, (open contact) 



(9) 
(10) 



Equations (9) and (10) represent an inequality complementarity behavior, for which 
the product of the relative normal gap and normal contact force is zero; i.e. 

gNXN=0 (11) 

The relation between the normal gap and normal contact force is described by 

gN > 0, A^ > 0, gNXN = (12) 

which represents the inequality complementarity condition between gN and Xn, the 
so-called Signorini's condition. 



a Tangent contact direction 




Fig. 2 (a) Relative normal gap; (b) normal and tangential contact forces 
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Fig. 3 (a) Signorini's normal 
contact law; (b) Coulomb's 
friction law 
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The inequality complementarity behavior of the normal contact law is depicted 
in Fig. 3a and shows a set-valued graph or a corner of admissible combinations 
between gN and Xn [22]. When two rigid bodies are contacting, the Signorini's 
condition given by Eq. (12) needs to be complemented with an impact law, such 
as the well known Newton's kinematical law that relates the pre- and post-impact 
velocities to the bodies' normal coefficient of restitution, sn- 

The classical Coulomb's friction law is another typical example that can be con- 
sidered as a set- valued force law [18, 23-25]. The magnitude of the static friction 
force is less than or equal to the maximum static friction force which is also propor- 
tional to the normal contact force. Consider again the two contacting rigid bodies 
depicted in Fig. 2, in which Coulomb friction is present at the contact points 1 and 2. 
The relative velocity of point 1 with respect to point 2 along their tangent plane is 
denoted by yr- If contact between the two bodies takes place, i.e. gff = 0, then the 
friction phenomenon imposes a tangential force A 7- as is illustrated in Fig. 2b. If the 
bodies are sliding over each other, the friction force A 7- has the magnitude ixXn and 
acts in the direction opposed to the relative tangential velocity 



-Xt = fiXNSgniyr) yr 7^ 



(13) 



where ji is the friction coefficient and A^v is the normal contact force. If the relative 
tangential velocity vanishes, yr = 0, then the bodies purely roll over each other 
without slip. Pure rolling, or no-slip for locally flat objects, is denoted by stick. If the 
bodies stick, then the friction force must lie in the interval —^iXn < Xt < ^iX^ . 
For unidirectional friction three different scenarios can occur, namely 



yr = => lArl < iiX^ (sticking) 

yr < ^ Xt = +ij^Xn (negative sliding) 

yr > => Ar = —fiX^ (positive sliding) 



(14) 
(15) 
(16) 



These three scenarios can be summarized by a set-valued force law as 

-Xt € iiXNSgniYT) (17) 

Figure 3b shows the Coulomb's friction law as a set-valued force law [18]. 
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4 Dynamics of Nonsmooth Rigid Multibody Systems 

From classical mechanics, it is well known that the Newton-Euler equations of 
motion of a multibody system with / degrees of freedom and with only friction- 
less bilateral constraints can be written as [26] 

Mu-h = (18) 

q = u Vf (19) 

where M = M (q, f) e M.^'^-f is the positive definite and symmetric mass matrix, 
h = h (q, u, t) G ffi-^ represents the vector of all external and gyroscopic forces 
acting on the system forces originating from springs and dampers are also included 
in vector h,q = q(f) e M^ is the /-dimensional vector of generalized coordinates, 
u = u (f ) G M.^ addresses the system generalized velocities and u = u (f ) e M.^ is 
the vector that contains the system accelerations. 

It is clear that Eq. (18) represents a classical second-order differential equation 
that describes the dynamic behavior of a multibody system without any contacts and 
contact forces. Therefore, when a system includes frictional unilateral constraints, 
the occurring contact forces should be taken into account in the equations of motion. 
In general, the magnitudes of the normal and tangential contact forces are added 
to the equations of motion by using the Lagrange multiplier technique [27]. Thus, 
adding the contact forces to Eq. (18), the dynamic equations of motion of a rigid 
multibody system with normal and tangential contact forces can, for non-impulsive 
motion, be written on the acceleration level as [19, 23] 

Mu-h-WN^N -y^T^T =Oa.e. (20) 

q = u Vf (21) 

where W^v = W^v (q. e K-^''" and Wr = Wr (q, e M^""" gather the gener- 
alized normal and tangential force directions wat, and wr, , respectively. The normal 
and tangential contact forces have magnitudes X^t and Xn for each contact point / . 
The dual variables to the normal contact forces X^ are the variations of normal 
gap distances g^v, while the dual variables to the generalized friction or tangential 
forces Xt are the variations of the generalized sliding velocities y^. The remain- 
ing terms of Eq. (20) have the same meaning as described above. It is important 
to note that Eq. (20) requires the existence of the velocities u as well as the exis- 
tence of accelerations u. Motion without impulses implies that XNiO is (locally) 
bounded and time-continuous. The velocities u(f ) therefore exist on non-impulsive 
time-intervals. The friction force Xrit) is discontinuous when a slip-stick transi- 
tion takes place or when the relative sliding velocity of a frictional contact reverses 
its sign. The acceleration u is not defined when Xrit) is discontinuous. The set of 
time instances for which Xrit) is discontinuous is of measure zero and Eq. (20), 
therefore, holds for almost all t . 
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Impulsive motion is described by the impact equation, 

M(u+-u")-WArAAr -WrAr =0 a.e. (22) 

+ ,,, ,. q(f + Af)-q(0 _ q(r + AO-q(f) 

u (f) = lim , u (r) = lim (23) 

A(4.o Ar Arto Af 

which relates the velocity jump to the impulsive forces A^v and Aj- in normal and 
tangential direction respectively. We assume that the velocities u(0 are of locally 
bounded variation (without singular part) and denote u~(?) and u"^(?) as the pre- 
and post-impact velocity respectively. Furthermore, note that finite forces, such as 
gravity or reaction forces from springs and dampers, are non-impulsive, and do not 
occur in Eq. (22). 

Following Moreau [28] we will cast the non-impulsive dynamics (20) and the 
impulsive dynamics (22) in a unified description, by using an equality of measures. 
This constitutes the general framework for nonsmooth rigid multibody dynamics 
[24,29]. 

Multiplying the equation of motion (20) with the Lebesgue measure df and the 
impact equation (22) with the atomic measure drj, being the sum of the Dirac point 
measures at the impact times, yields 

Mudr-hdf-WivAArdr-WrArdf = (24) 

M(u+-u")d?]-W7vAArd??-WrA7'd?? = (25) 

Addition of Eqs. (24) and (25) results in 

M [udr + (u+ - u~) d??]-hdf-WAr {XnAi + AArd?j)-Wr (Aj-dr + Aj-d?]) = 

(26) 

or more briefly, 

Mdu - hd? - WivdPiv - WrdPr = (27) 

The differential measure for the velocities du = ud? -I- (u"*" — u~)d?y consists of the 
Lebesgue measurable part ud/, which accounts for absolutely continuous motion, 
and the atomic parts which accounts for impulsive motion. Hence, for impact free 
motion it holds that du = lidf . Similarly, the measure for the so-called percussions 
corresponds to a Lagrangian multiplier which gathers both finite contact forces A 
and impulsive contact forces A, that is, dP = Xdt + Ad/] [30]. 

In what follows the resolution of the equations of motion expressed in the form of 
the equality of measures (27) is briefly presented and discussed in a review manner. 
The inclusions that are necessary to solve the frictional unilateral contact events in 
an autonomous multibody system, based on the Newton's impact law combined with 
the Coulomb's friction law, are also stated. In addition, the force laws are related 
to the systems' kinematics. The interested reader in the detail description of this 
formulation is referred to the references [18, 28, 3 1]. 
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Since the impenetrability condition between colliding bodies is required, let us 
consider a MBS with n of frictional unilateral constraints, which can be represented 
by n inequalities as, 

gM(q,O>0,/ = 1,...,« (28) 

where the quantities g^j are the normal gap functions of the frictional contacts. 
They are formulated such that, gf^i > indicates an open or positive contact with an 
Euclidian distance of the contact points given by the value of gM, gi^i = corre- 
sponds to a closed or active contact, and g^i < indicates the forbidden overlapping 
or interpenetration between rigid bodies. A rigorous treatment of the definition of 
these inequalities, within the framework of multibody systems formulation, is pre- 
sented and discussed by Pfeiffer and Glocker [32] and Glocker [18]. 
The set of active contacts in the present work is stated as, 

H{t) = {i\gm(<\,t)=0} (29) 

which singles out the contact(s) at which contact-impact forces may occur. 

In order to define the constitutive force laws which relate the contact-impact im- 
pulse measures to the system's kinematics q and u, let us first introduce the normal 
and tangential relative velocities at the contacts as [33] 

YNi = wJ,.U + WNi (30) 

YTi = wljU + WTi (31) 

where wat, and wr, represent the generalized normal and tangential force direc- 
tions, respectively, and wn,- and ivTi are rheonomic terms [18]. 

The equations of motion (27) can now be complemented with constitutive laws 
for normal and tangential contact-impact forces. In the present study, a unilateral 
version of the Newton's impact law is considered for the normal direction with 
local coefficient of restitution s^i e [0, 1]. The Coulomb's friction law is used for 
the tangential direction with coefficient of friction /x, , which is complemented by a 
tangential coefficient of restitution sti e [0, 1]. For the case of a completely elastic 
contact the coefficient of restitution is equal to unity, while for a perfectly inelastic 
contact the coefficient of restitution assumes the value of zero. 

It is important to note that for the Newton's impact law, the impact, which causes 
the sudden change in the relative velocity, is accompanied by a normal contact im- 
pulse dP^r > 0. Suppose that, for any reason, the contact does not participate in the 
impact, that is, that value of the normal contact impulse is zero, although the contact 
is closed. This situation happens normally for multiple contact scenarios. Therefore, 
for this case, we allow the post-impact relative velocity to be higher than the value 
prescribed by Newton's impact law, with the intent to express that the contact is su- 
perfluous and could be removed without changing the contact-impact process. Thus, 
in order to account for these possibilities, two parameters are defined as [33] 

?M ■= Yni + ^NiYm (32) 

b := YTi + ^TiYri (33) 

where {Ym^YTi) '■= iYNi,YTi) (u=^)- 
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Thus, the normal and tangential impact laws can be stated as two inclusions 

-dPiv,- GUpr(^7V,) (34) 

-dPr/ GM,dPw,Sgn(^r/) (35) 

Finally, the complete description of the dynamics of nonsmooth system, which 
accounts for both impact and impact-free phases, is given by Eqs. (27)-(35). This 
problem can be solved by using the Moreau's time-stepping method, which is pre- 
sented and discussed in the next section. 



5 Moreau's Time-Stepping Method 

The time-stepping methods provide a discrete numerical scheme suitable for the 
simulation of nonsmooth systems [30-35]. These methods are widely used due to 
their simplicity to implement and their robustness. The time-stepping schemes are 
based on a time-discretization of the system dynamics. The whole set of discretized 
equations and constraints is used to compute the next state of the motion. Among 
the various time-stepping methods available in the literature, the Moreau's mid- 
point method is one of the most popular and is considered in the present work [28]. 
The equality of measures (27) together with the set-valued force laws (34) and (35) 
form a measure differential inclusion which describes the time-evolution of a multi- 
body system with discontinuities in the generalized velocities, that is, a nonsmooth 
dynamical system. A general way to solve this mathematical problem consists of ap- 
plying the Moreau's time-stepping method, which does not make use of the classical 
equations of motion, which relate the accelerations to forces, but considers the equa- 
tions of motion at the velocity level (27). The first step of the Moreau's approach 
consists of the time-discretization of the measure differential equation. Integrating 
Eq. (27) over a small finite time interval Af, of which initial and end points are 
denoted by the indices A and E, yields the following terms 

/ Mdu K Mm Au = Mm (ue -ua), Mm = M (qm , fjw) (36) 

At 

hdt = Ah ^ hM At , hM = h(qM.u^,?M) (37) 

At 

jWNdPN = VfNMPN, WiVM = Wat (Qm. fw) (38) 

At 



h 



WrdPr = WrMPr. W™ = Wr {({mJm) (39) 

At 
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in which ?m is the midpoint time instant of the compact time interval [Ia , ts ] and 
q^ = q^ + iu^A? is the midpoint system's position state. It is clear that the 
midpoint time instant can be evaluated as 

tM = tA + \^t (40) 

Finally, after the above discretization, the equations of motion expressed at the ve- 
locity level can be written as [20] 

Mm (me - »a) - hM Ar - W^mP^v - W^Pr = (41) 

together with the set- valued contact/impact laws 

-P^ G Upr(^^) 4» -P^ e Nc^ (^n) (42) 

-Pr e MPivSgn (^t) ^ -Pr e A^Cr(Piv) (?r) (43) 

This set of algebraic inclusions can be solved with a linear complementarity prob- 
lem (LCP) formulation or by an augmented Lagrangian approach (ALA) [17]. The 
velocity U£:, at the end of time-step ?£• = f^ -|- A?, is subsequently calculated by 
using Eq. (41). Finally, the positions at the end of the time step are calculated by 

q£ = qjw + l^tME (44) 

Note that Eq. (42) applies only to active set-valued force laws, / e H{t), i.e. set- 
valued force laws that can be described at the velocity level. As friction elements 
are naturally defined at the velocity level, they are always active and can always 
be described by (43). Considering unilateral contacts, Moreau's midpoint algorithm 
calculates the contact distances g^i of all unilateral contacts at the midpoint q^ in 
order to evaluate whether these are active {gNi < 0) or not {gNi > 0)- Only active 
unilateral contacts can be described by inclusion (42). Unilateral contacts that are 
non-active, thus open, are disregarded because it is assumed that their contact force 
contribution is equal to zero. Figure 4 shows the flowchart of the general computa- 
tional strategy, based on the Moreau's time-stepping method, to solve the equations 
of motion for multibody systems with frictional unilateral constraints. 

In what follows, the LCP formulation to solve the contact-impact problem of 
multibody systems with frictional unilateral constraints is presented, which closely 
follows the work by Glocker and Studer [20]. In order to set up the LCP, let us first 
introduce the following matrix notation 



Wmm 


= mat(wAr,(qM,fM))eK-^'' 


, i & H 


(45) 


w™ 


= mat(wr,(qM,fM))eR-^'' 


, i &H 


(46) 


^NM 


= C0l(vi>M(qM,?M)) G M', 


i e H 


(47) 


Wt-m 


= co\ (wTi{<\M,tM)) eK', 


i e H 


(48) 
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Evaluate 

l£ 
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Fig. 4 Flowchart of the computational procedure for the solution of the equations of motion of 
constrained rigid multibody systems with frictional unilateral constraints 



PiV 


= col (PNi) e W , 


/ €H 


Pr 


= col iPn) G M' , 


i e /f 


yNE 


= COl(yw£;)GM'. 


;■ e /f 


Vre 


= col(y7-£/)eM', 


i e // 


VWA 


= col(yA.A/)eM', 


i e // 


yTA 


= col(yrA/)eK', 


/ e/f 


^N 


= colfe)eM', 


i e // 


h 


= colfe)eM'. 


i e H 


EN 


= diag(6M)eM'\ 


i €H 


ET 


= diag(6r;)eK' 


i G H 


V- 


= diag(/x,)eM', 


i G // 



(49) 
(50) 
(51) 
(52) 
(53) 
(54) 
(55) 
(56) 
(57) 
(58) 
(59) 
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Thus, the contact-impact problem of nonsmooth systems can be summarized by the 
following mathematical relations 

Mm (u£ - u^) - hM Af - WnmPn - "WtmPt = (60) 

Yne = '^'nm"e + y/NM (61) 

yTE = ^TM"E + ^TM (62) 

yNA= ^llM^A + WATM (63) 

yTA= ^\m^a + wrM (64) 

^N = Vne + ^nYna (65) 

^T =yTE+^TyTA (66) 

-P^eUpr(§^) (67) 

-PrGliPivSgnC^r) (68) 

The values of y^^ and yj-^ can be evaluated by using Eqs. (63) and (64), respec- 
tively, since the velocities u^ are known at the left endpoint of the time interval. 
Introducing now Eqs. (61) and (52) into Eqs. (65) and (66) yields 

^N = ^llM^E + (^NM + ^NyNA) (69) 

^j. = W^mU£ + (WT-M + SryTA) (70) 

Now, it should be mentioned that the inclusions for the contact-impact force laws 
need to be formulated as complementarity conditions. Thus, the unilateral primitive 
of Eq. (67) results in 

-P^ e Upr(§^) 4» Ptv > 0,^^ > 0,P]^§^ = (71) 

In turn, the relay function (68) have to be decomposed into two Upr's to achieve the 
desired complementarity conditions. Thus, Eq. (68) yields 

-PreiJLP7vSgn(^r) 

r (jlPtv + Pr > 0, ^R>0, i[iPN+PTf^R=0 
<> 3^K, §iS.t. I ^P^ - Pr > 0, §i > 0, ([iPn - Prf U=0 (72) 

in which the step height is [— jxPat, -|-|xP7v]. In addition, to abbreviate the comple- 
mentarity conditions of Eq. (72) the impulsive friction saturations Pr and Pl are 
defined as [18] 

Pr:=[iPi^+Pt, Pr€W (73) 

Pl-V^Pn-Pt^ PleK' (74) 

together with 

^T=^R-U,^R^ U^^' (75) 
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The whole set of complementarity conditions of Eq. (72) can be rewritten as 

0<Ui^U Pi^ >0 (76) 

The reason for the special arrangement of P/, and ^l in Eq. (76), must be sought in 
optimization theory. Without this special arrangement, one is not able to be set up 
the LCP formulation without additional matrix inversion processes [18]. Since the 
variables ^r, Pr and U£ are not included in (28), they have to be eliminated. Thus, 
combining Eqs. (60) and (73), yields 

Mm (u£ - u^) - hM Af - {Wnm - WrMl^) Pn - WtmPr = (77) 
Substituting now Eq. (75) into Eq. (70) results in 

^R=^TM^E +{y^TM + eTyTA) + ^L (78) 

The elimination of variable Ft can be done through the combination of Eqs. (73) 
and (74), which can be written as 

Pl=2ji/'^-P« (79) 

Since the inversion of mass matrix M is always possible, Eq. (77) can be solved 
foru£ 

U£ = u^ + M^^hM Ar + M^i {Wnm - WrMlJi.) P^v + M^^WthPr (80) 

Now, Eqs. (63) and (64) are used to express W]^^u^ and W]^^u^ in terms of y^^ 
and yr^ 

^nm"a = Vna - ^NM (81) 

W^j^fUA =Yta- ^tm (82) 

Introducing Eqs. (80)-(82) into Eqs. (69) and (78), yields 

^N = W^NM^M^M^t + wI^Mm^ (Wnm - Wtm[i-)Pn 

+ WImMjIWtmPr + (I + bn) Yna (83) 

^R = W]^^M^ihM Af + WIm^m' ("Wnm - WrMjJi) P^v 

+ WT^M^iWrMP« + (I + er)yr^+§L (84) 

Thus, Eqs. (83), (84) and (79) can be written in a matrix form as 
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= W^j^M^i (Wnm-WtmIi) W^^M^iW™ I 
V 2ii -I 

{^Im^m^m^i + (I + Bn) yNA\ 
+ WT^M^ihMAr + (I + eT)yTA (85) 

Equations (85) together with the conditions (76) form the LCP for the contact-impact 
analysis of multibody systems with frictional unilateral constraints. The LCP (85) 
is solved in each integration time step. Then, the velocities u^ and positions q^ for 
the subsequent time steps are obtained from Eqs. (80) and (44). 

Figure 5 illustrates the Moreau's time-stepping method with an LCP formulation 
developed under the framework of MBS formulation. 



4 Demonstrative Application to a Slider-Crank Mechanism 

This section deals with the dynamic modeling and analysis of a planar slider-crank 
mechanism with a translation clearance joint. This multibody mechanical system 
consists of four rigid bodies, which represent the ground, the crank, the connecting 
rod and the slider. The body numbers and their center of mass are shown in Fig. 6. 
The ground, the crank, the connecting rod and the slider are constrained via ideal 
revolute joints. The center of mass of each body is considered to be located at the 
mid distance of the bodies' total length. The translational clearance joint is com- 
posed by a guide and a slider. This joint has a finite clearance, which is constant 
along the length of the slider. 

Figure? shows a translational clearance joint. The clearance c is defined as the 
difference between the distance of the guide and the slider surfaces. The geometric 
characteristics of the translational clearance joint are the slider length 2a, the slider 
width 2b, and the distance between the guide surfaces (i. In an ideal translational 
joint the two bodies translate with respect to each other parallel to the line of trans- 
lation, so that, there is neither rotation between the bodies nor a relative translation 
motion in the direction perpendicular to the axis of the joint. The existence of a 
clearance in a translational joint introduces two extra degrees of freedom. Hence, 
the slider can move 'freely' inside the guide limits, until it reaches the guide sur- 
faces. 

The modeling of translational clearance joints is a complex task, due to the sev- 
eral possible contact configurations between the slider and guide. Figure 8 illustrates 
four different scenarios for the slider configuration relative to guide surface, namely: 
(i) No contact between the two elements: the slider is in free flight motion inside the 
guide; (ii) one corner of the slider is in contact with the guide surface; (iii) two ad- 
jacent slider corners are in contact with the guide surface, which implies that a face 
of slider is in contact with the guide surface; (iv) two opposite slider corners are in 
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contact with the guide surface. The conditions for switching from one case to an- 
other depend on the system's dynamic response as well as on the material colliding 
properties. 

In order for the translational clearance joint to be simulated in the multibody 
system environment, is it first required that the system's equations of motion be 
derived. In this work the Lagrange's equation of second type is used and it can be 
written as [36] 

d /'dL\ dL 

57 (d- Si; ="'' = ' ^ ""' 



Fig. 5 Flowchart of the 
Moreau's time-stepping 
algorithm with an LCP 
formulation 
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where L is the Lagrangian of the system, that is, the difference between kinetic and 
potential energies, expressed in terms of the generalized coordinates and their time 
derivatives. 

Since the slider-crank mechanism represented in Fig. 6 has three degrees of free- 
dom, three is also the number of generalized coordinates that uniquely represent the 
system's configuration. Furthermore, the crank, the connecting rod and the slider 
have masses m,- and moments of inertia with respect to the principal central axes 
perpendicular to the plane of motion /,-, where ;' = 1,2 and 3. Thus, the vector of 
generalized coordinates and velocities are defined as 



(87) 



u = I ft^2 I • with q = u a.e. 



(88) 



Thus, applying the Lagrange's equation to slider-crank mechanism yields [37] 



<Mn Mi2 Mi3\ fdA /hi' 

M21 M22 M23 6*2 = Mi2 

^Mai M32 M33/ V^'s/ \^3/ 



(89) 



in which 



■//////// 




777777777777777777' 



Fig. 6 Slider-crank mechanism with a translational clearance joint 
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Fig. 7 Translational joint 
with clearance ttiat is, the 
slider and guide 
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Fig. 8 Diiferent scenarios 
for the slider and guide 
interaction: (a) no contact; 
(b) one comer in contact with 
the guide; (c) two adjacent 
corners in contact with guide; 
(d) two opposite corners in 
contact with guide 
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M-ix = /i + I -mx + OT2 + '"3 I l\ 

Mi2 = M21 = I -m2 + nii J hh cos (6*2 - ^1) 
Mi3 = M31 = M23 = M32 = 



(90) 

(91) 

(92) 

(93) 

(94) 

hi = I -ni2 + OT3I/1/2 sin (02 -6*1) 611- ( -mi + m2 + mA ghco^Oi (95) 



( -m2 + OT3 1 



M22 = J2 + [ t'M2 + mj,]ll 



M 



= ( x'^z + OT3 j 



33 = ^3 

2 



h2 = -[ -m2 + OT3 I /i^2 sin (62 -0i)9f -I -m2 + rm \ gh cos 62 (96) 



/!3=0 



(97) 



In order to determine the gap functions let us consider Fig. 9 where a generic po- 
sition of the slider inside the guide is illustrated with the purpose to represent the 
closed kinematic chain of each potential contact point. 
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From analysis of Fig. 9 and considering the system kinematics, the mathematical 
expressions of the gap functions can be written as [37] 



gTl 



d 
= /i sin 9i — I2 sin 62 + a sin 63 — b cos 6*3 

= /i cos 6*1 + I2 cos 62 — a cos 63 — b sin 9^ 

d 
g]\f2 = h sin Oy — I2 sin 62 — a sin 63 — b cos 9^ 

gT2 = h cos 6*1 + I2 cos 62 + a cos 63 —b sin 6*3 

g-7V3 = h /i sin 6\ + I2 sin O2 — a sin 63 —b cos ^3 

^73 = /i cos 01 + I2 cos 02 — a cos 63 + b sin 63 

g"Ar4 = h /i sin 6*1 + I2 sin (^2 + cf sin 63 — b cos f?3 

^7-4 = h cos 01 + /2 cos 62 + a cos f?3 + Z? sin 63 



Then, the w vectors and of the vv scalars associated with each contact point 
obtained as 



dgNi 

WATi = — - 

9q 



dgTi 
wn = —: — - 
3q 



dgN2 

9q 



— Zi cos^i 

-I2 cos 02 

ya cos 63 + b sin ^3^ 

— /i sin^i 

-I2 sin ^2 

Va sin 63 —b cos 6*3^ 

— /i cos 6*1 

-h cos 6*2 

V— a cos 93 + b sin ^3^ 



(98) 
(99) 

(100) 
(101) 
(102) 
(103) 
(104) 
(105) 

can be 
(106) 

(107) 
(108) 
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Fig. 9 Generic position of 
the slider inside the guide 
where the distance between 
guide upper and lower 
surfaces is exaggerated for 
illustration purpose 
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-I, Sine, \ 
wr2 = ^= -/2smf?2 (109) 



dq 



y—a sin t)j, — b cos a-i/ 



WiV3 = 4^= /2COS02 (110) 



9q 



-a cos 03 + b sin 6*3 y 
— /i sin^i 



wr3 = ^=| -/2sin02 I (111) 



dq 



dgN4 



ya sin 63 + h cos 62^ 



/i cos 6* 



WiV4 = -^ — =1 /2 cos 6*2 I (112) 

\a cos 63 + b sin ^3/ 

wr4 = ^= -/2sinf?2 (113) 

\— a sin 63 + b cos f?3/ 

VVATi = WTl = WM2 = WT2 = WM3 = WT3 = WN4 = WJA = (114) 

The geometrical characteristics, the inertial properties, the force elements, 
the contact parameters and the initial conditions necessary to perform the dy- 
namic analysis of the slider-crank mechanism with a translational clearance joint 
are listed in Table 1 . 

Figure 10 shows the corners motion in a dimensionless form for two full crank 
rotations, in which the free slider motion and contact-impact events can be observed. 
Figure 1 1 illustrates the crank speed, the connecting-rod speed and the portraits rel- 
ative to connecting-rod and slider for two complete crank rotations. 

The dimensionless slider trajectories are shown in Fig. 10, where the different 
types of motion between the slider and guide observed are associated with the 
different guide-slider configurations, i.e., no contact, impact followed by rebound 
and permanent contact between the joint elements. The effects of impact between 
the slider and guide surfaces are also quite visible in the plots of Fig. 1 lb and c, 
namely, one can observe the discontinuities in the connecting-rod speed. On the 
other hand, the smooth changes in the speed indicate that the slider and guide sur- 
faces are in permanent contact for long periods, as it is illustrated in the slider 
portrait of Fig. 1 Id. 

It should be highlighted that some numerical difficulties can arise when the clear- 
ance size is very small, which will lead to the well known drift problem. In these 
situations, one possible way to overcome those difficulties consists of a projection 
technique, in which the excessive penetration between the slider and guide surfaces 
is eliminated in each time step in order to avoid the further interpretation of the 
bodies. When this scheme is implemented, special attention should be paid to the 
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Fig. 10 Dimensionless motion of the slider comers, (a) Comer 1; (b) comer 2; (c) comer 3; 
(d) comer 4 
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Fig. 11 (a) Crank speed; (b) connecting-rod speed; (c) connecting-rod portrait; (d) slider portrait 
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Table 1 Parameters used in the dynamic simulation of the 
slider-crank mechanism 



Geometrical characteristics Ij = 0.1530m 

h = 0.3060 m 
a = 0.0500 m 
b = 0.0250 m 
c = 0.0010 m 

Inertial properties mi = m2 = 0.0380kg 

mj = 0.0760 kg 
J, = 7.4 X 10^'kgm^ 
J2 = 5.9x 10-''kgm^ 
J3 = 2.7 X lO^^kgm^ 

Force elements g = 9.81 m/s- 

Contact parameters Eni = en2 = en3 = en4 = 0.4 

E'Y] ^ £x2 ^ ^T3 ^ ^T4 ^ 0.0 
J^l = JJL2 = M-3 = M-4 = 0.01 

Initial conditions 0|o = O20 = 630 = 0.0 rad 

ooio = 150.0rad/s 
CO20 = — 75.0rad/s 
CO30 = O.Orad/s 



conservation of the systems energy, since it can lead to overestimated total system 
energy associated with the contact-impact phenomena. 



5 Conclusions 

A comprehensive investigation of contact-impact analysis in multibody systems 
based on the nonsmooth dynamics approach was presented in this work. The 
methodology was based on the nonsmooth dynamics approach, in which the interac- 
tion of the colliding bodies is modeled with multiple frictional unilateral constraints. 
The dynamics of rigid multibody systems were stated as an equality of measures, 
which were formulated at the velocity-impulse level. The equations of motion were 
complemented with constitutive laws for the forces and impulses in normal and 
tangential directions. The formulation of the generalized contact-impact kinemat- 
ics in the normal and tangential directions was performed by obtaining a geometric 
relation for the gaps of the candidate contact points. The gaps were expressed as 
functions of the generalized coordinates. The candidate contact points were mod- 
eled as hard contacts, being the normal and tangential contact laws formulated as 
set-valued force laws for frictional unilateral constraints. 

In this work, the unilateral constraints were described by a set-valued force law 
of the type of Signorini's condition, while the frictional contacts were characterized 
by a set- valued force law of the type of Coulomb's law for dry friction. The result- 
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ing contact-impact problem was formulated and solved as a linear complementarity 
problem and with the augmented Lagrangian approach, which were embedded in the 
Moreau's time-stepping method. Finally, the effectiveness of the presented method- 
ologies was demonstrated through the study of the slider crank mechanism with 
a translational clearance joint. The main results obtained from this research work 
showed that the effect of the contact-impact phenomena can have a predictable non- 
linear behavior. This nonlinearity aspect is more evident when the system includes 
friction phenomena. With the knowledge of nonlinearities in multibody systems, 
chaotic behavior may be eliminated with suitable design and/or parameter changes 
of a mechanical system. This feature plays a crucial role in the dynamics, design 
and control of general multibody systems of common application. 
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Application of General Multibody Methods 
to Robotics 

Janusz Fr^czek and Marek Wojtyra 



Abstract In this chapter robotic applications of general multibody system (MBS) 
simulation methods, based on absolute coordinates formalism, are presented. Three 
typical problems, often encountered in robotics, are discussed: kinematic analysis 
with singular configuration detection, simulation of parallel robot dynamics inves- 
tigated jointly with the robot control systems properties, and finally, simulation of 
a robot with flexibility effects taken into account. In case of singular configuration 
detection simplest types of singular configurations are analyzed - turning point and 
bifurcation point. The second case of MBS application is an example of parallel 
robot dynamic analysis when model based control is taken into account. The last 
part of the chapter is devoted to the analysis of complex, flexible power transmis- 
sion mechanism carried out with general MBS formalism. 



1 Introduction 

Many modern robots are constructed in such a way that they have relatively simple 
kinematic structure and, as a result, solutions to direct and inverse kinematics prob- 
lems, formulated usually in joint coordinates, are known in closed (symbolic) form. 
Moreover, differential equations of robot motion may also be formulated in a prob- 
lem specific way. Most often algorithms of kinematic or dynamic calculations, 
which are used by robot controller, are tailored to fit the particular robot. Computer 
code written especially for specific mechanism structure is simpler and more effi- 
cient than a general program. Thus, usually there is no need for general multibody 
methods application in robot everyday operation. 

General multibody methods can be very useful, however, during the robot de- 
sign process. It is convenient to use these methods, especially in case of analysis or 
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synthesis of robot with complicated structure (with many closed kinematic loops), 
since they do not require closed form kinematics solution. Moreover, kinematic and 
dynamic equations are formulated automatically and no tedious derivation of for- 
mulas is required. The general multibody methods can be used, e.g., for kinematic 
structure synthesis and analysis, for virtual tests of control system, as well as for 
calculating loads and stresses in robot elements. 

It is worth noting that quite often general and commonly used multibody meth- 
ods are based on absolute coordinates formalism. This formalism is not popular in 
robotics, since joint coordinates are most commonly used in that field. 

In this chapter robotic applications of general multibody simulation methods, 
based on absolute coordinates formalism (which is commonly used in practical cal- 
culations and in popular software packages) are presented. Three typical problems, 
often encountered in robotics, are discussed: kinematic analysis with singular con- 
figurations detection, simulation of a parallel robot dynamics investigated jointly 
with the robot control systems properties, and finally, simulation of a robot with 
flexibility effects taken into account. 



2 Absolute Coordinates Approach to Robot Kinematic Analysis 
and Singular Configuration Detection 

Singularities of manipulators are a widely recognized issue in robotics. Robot 
singularities usually mean kinematic singularities where the number of robot de- 
grees of freedom locally changes. The singularities were intensively studied in 
robotics, e.g. [1-3], but the classification and global description of singular con- 
figurations exists only for special cases. Vast majority of mentioned algorithms is 
devoted to singularity analysis of robots and manipulators with known, relatively 
simple structure and with small number of kinematic loops. Many authors investi- 
gated special problems of singularities in case of analysis of parallel manipulators. 
Singularities of parallel manipulators were classified in terms of kinematic input- 
output relations and this classification is commonly applied [4]. 

The singular configuration detection is also an important topic in general multi- 
body simulations. Unfortunately, the ideas and methods of singularity detection 
and classification developed in robotics cannot be directly transferred to topology- 
independent kinematic analysis of general, complicated multibody systems ana- 
lyzed in absolute coordinates. It should be emphasized that number of papers 
devoted to singularity detection and analysis in case of kinematic simulation of gen- 
eral multibody system is limited. 

In this section we will show an attempt of novel application of numerical con- 
tinuation algorithm [5] to kinematic analysis of a robotic system. This algorithm 
uses local coordinate parameterization instead of time parameterization which is 
commonly used in multibody system simulation. The algorithm is suitable for 
kinematic analysis of a general multibody system described in absolute coordi- 
nates. The simplest cases of singularities may be detected using mentioned ideas. 
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We will illustrate developed methods on the example of preliminary kinematic 
studies of a complicated robotic construction [6] in both singularity-free and sin- 
gular configurations. 



2.1 Theoretical Background 

The algorithm of kinematic analysis in absolute coordinates is usually based on the 
trajectory tracing using time as the independent parameter and the classical Newton 
iteration scheme [7] . In case of analysis of complicated mechanism, simulation usu- 
ally fails in singular positions and the reason of that cannot be easily detected by the 
user. This creates needs for singularities detection and analysis. Moreover, singular 
positions analysis can provide interesting information, relevant for the mechanism 
structural synthesis and control system synthesis. 

The collection of all constraints induced by the joints present in MBS is de- 
noted by: 

$'^(q,r) = 0, (1) 

where q is the vector of absolute generalised coordinates [7]. 

The number of scalar equations in (1) is equal to / and the number of general- 
ized coordinates is equal to A'' =6m, where m is the number of rigid bodies (three 
variables for rotation parameterization are used). Typically, / is less than A^. 

Motion is defined by a time dependent constraint equation (driving constraints): 

$'^(q,0 = 0. (2) 

In the most general case, when constraint equations are induced by either joints or 
driving constraints, the following equations must be satisfied at any time f (position 
level analysis): 



■"^^Cq.O 



^'^(q,0 



$(q,?) = 

and (velocity and acceleration level analysis, respectively): 
d^(q,t) 



= (3) 



dt 

J2$ 



«&qq + «I>r = 0, (4) 

<tqq + (4>qq)qq + 2<S>t^q + <i>t, = 4>qq - T = 0. (5) 



dt^ 

If constraints (3) are independent in point qo = [qg toV (regular point), i.e. if: 

rank ($q) = A^, (6) 

then unique solutions of linear systems (4) and (5), as well as nonlinear system (3), 
exist in the neighbourhood of point qo = [qj^ to]^ ■ From numerical point of view. 
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the kinematic analysis of a system described by (3) can be considered as a numerical 
tracing of a trajectory T such that: 

r = {q = [q^.f]^:4>(q) = 0,q = q(f), ?o < f < ?i, Q e ^^+'} . (7) 

In general, the numerical tracing of trajectories (7) is the subject of the numerical 
continuation methods [5]. One of the simplest methods is the Euler predictor- 
Newton corrector scheme. The classical Newton corrector can be defined in the 
form: 

$q(q^, f' + ^)/lq'' + $(q^, f' + i) = 0. (8) 

The iterative algorithm (8) is numerically very efficient, however, under strong 
assumption that condition (6) is fulfilled (i.e., all points of the trajectory are regular) 
and when good starting position is chosen. In case the trajectory contains singular 
points which correspond to robot singular positions, Eq. (8) becomes ill-conditioned 
and simulation usually fails. 

We will distinguish two simplest cases of singular configuration of MBS [8]: 

1. Turning point (limit point or fold bifurcation) - 'when rank{'^q) = N — l and 
rank{^~) = N and there exists a parameterization q(T), f (t), with q(To) = qo 

and f (to) = to, and d^t / dx^ ^ 0. 

2. Simple stationary bifurcation point - when rank{^ (^) = rank(<^~-) = N — l and 
exactly two branches of solutions intersect with two distinct tangents. 

In both cases point qo = [q^ ?o V fulfils Eq. (7). 

The geometrical representations of these two singular configurations are shown 
in Fig. 1 on the example of a slider-crank mechanism. 

In order to numerically detect and describe two simple cases of singular configu- 
ration, general continuation scheme was applied [5]. The Newton corrector (8) was 
replaced with a more general corrector given by: 

,j^-\-l _l^ ,^jt ^k .—0 '/ + 1 

q =q -$±(q )$(q ) and q =q , (9) 

q 

where (.)"'" denotes pseudo-inverse matrix (Moore-Penrose). 

Pseudo-inverse matrix can be calculated efficiently using sparse matrix J~^ given 
by the formula: 



5 = /-e'«&-^4.,. (10) 

In order to trace trajectory (branch) numerically, local parameterization strategy 
was chosen [5]. Instead of the time variable t, other parameter is chosen as the 
independent one. Symbol x in formula (10) is responsible for the parameter choice. 
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.V coordinate of the slider point C [m] 




-?///////// 



Fig. 1 Slider crank mechanism (planar example). Driving constraints are determined by equation: 
(pi—cot = 0: (a) AB/ BC > 1 turning point (lock-up position), (b) AC = BC (simple stationary 
bifurcation point). In both cases singular position occurs when link BC reaches vertical position 

The independent parameters are constant in the intervals of time. Moore-Penrose 
matrix can be calculated efficiently with the formula: 



$i = (I-ss^)(J"i),v, 
q 



(11) 



where s is a tangent vector and (J~^)Ar denotes the submatrix of matrix (10) built 
of the first A^ columns. It should be pointed out that, in the algorithm of matrix J~^ 
evaluation, sparsity of the matrices can be exploited intensively using, e.g., well 
known subroutines for sparse matrices. 

For bifurcation point detection a test function ^ is introduced, which is evaluated 
during the branch tracing. A bifurcation is indicated by function ^ value - at the 
bifurcation point the test function satisfies condition: ^ = 0. For the branch trac- 
ing given by iterative scheme (10), the test function can be proposed in the form: 

^ = det ( ~3 )■ This expression can be evaluated very efficiently if sparsity of the 

Jacobian matrix $q is taken into account. 
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In the close neighbourhood of the bifurcation point singular position can be 
evaluated with greater accuracy using direct method for branch calculating. The 
equation set can be extended to the new branch system: 



«&(Y) 



^(q,0 " 




4»q(q,0h 


= 0, Y=[q^fh^ 


hk-l . 





]^ (12) 



where h is a tangent vector. 

System (12) can be solved using efficient numerical solvers for sparse matrix 
equation and Newton-like iterative schemes. 

It should be pointed out that presented algorithms can be used for branch switch- 
ing, i.e., calculating one (at least) solution on the emanating branch. If one of the 
solutions is situated somewhat close to the bifurcation point, then other solution on 
the emanating branch can be found using techniques of perturbations, widely used 
in numerical continuation theory. 

A general numerical test program was developed using the described above gen- 
eral techniques for branch tracing and switching. Its idea was based on the research 
package named BIFPACK [8]. Numerical algorithm used in the test program can be 
implemented also in general environment of multibody codes, as a separate module 
devoted to branch tracing and simple singularity diagnosis in kinematic analyses. 



2.2 Robotic Example 

Practical applications of developed ideas can be demonstrated on the example of a 
multilink robot [9], which was primarily intended to weld car body in places which 
are not easy to reach. The robot consists of several sets of bodies, called segments 
(Fig. 2b). Segments are built of rigid parts. Kinematic scheme of an exemplary seg- 
ment is presented in Fig. 2b. Every segment consists of A + In rigid parts connected 
by spherical-translational and revolute joints. The segment mobility (Grubler count) 
does not depend on number n and is always equal to 2. Kinematic constraints im- 
posed on bodies are independent (there are no redundant constraints). Two, three 
or four (generally m) segments with different or equal number of bodies can be 
connected, giving a manipulator with Grubler count equal to 4, 6 and 8 {2m), re- 
spectively. In Fig. 2c a multilink robot built of three segments, with ten bodies each, 
is presented. 

Various analyses were preformed during the dimensional and structural synthesis 
process, and the following problems were investigated: 

• Deciding whether the desired trajectory can be realised for given structures and 
dimensions of the robot (detection of lock-up positions) 

• Detection of singular positions of the robot, and particularly positions where 
kinematic parameters of robot links are not continuous functions of time 
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Fig. 2 Multilink robot: (a) One segment of consisting of 12 rigid parts, (b) kinematic scheme of a 
six-part segment, (c) robot built of three 10-part segments 
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Fig. 3 Global coordinates (x, z) of the multilink robot top part centre 



• Determining the relative angles in the actuated revolute joints as functions of 
time (inverse kinematics) 

The robot kinematics was simulated using a model built in a general multibody 
package. Absolute coordinate formulation was used and test functions were imple- 
mented. 

In Fig. 3 one of the simulation results is shown. The turning point is detected, 
which can be geometrically interpreted as a lock-up position of the mechanism. Full 
analysis of robot behaviour in the neighbourhood of turning point was possible, 
since the branch tracing program was used. 
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3 Simulation Study of Stewart Platform 
with Model-Based Control 



Robot is a complex system consisting of manipulator mechanisms, actuators, sen- 
sors and control unit electronics. Due to the system complexity it is usually difficult 
to predict newly designed robot behaviour without numerical simulation. Thus 
building a numerical model quite often becomes necessary to provide the robot de- 
signers with required data. Each subsystem of the robot can be analyzed separately, 
however, the best results are obtained when all subsystems are modelled and simu- 
lated jointly. The general multibody methods and software can be relatively easily 
used to model the robot mechanisms, whereas other robot subsystems usually need 
different methods and software to be simulated [10, 1 1]. The multibody methods and 
models are versatile and thus they can be linked with other methods, which provide 
a good platform for integration of various scientific and engineering disciplines. 

In this section a simulation study of parallel robot and its control system is 
presented. Two well-known software packages, one designed to perform control 
system simulation and the other dedicated to multibody simulation, have been used 
to conduct the study. These packages were cooperating during simulations - both 
programs performed all calculations simultaneously. 



3.1 Methods 

In contemporary parallel robots the position control - widely used in earlier de- 
signs - is replaced by a model-based control. Due to high sampling frequency of 
control systems, the driving forces calculations must be performed fast. Therefore 
the inverse dynamics model of manipulator is usually simplified, and thus some 
aspects of manipulator motion are not represented in their full complexity. 

A 6-dof Stewart platform type parallel manipulator was investigated. The inves- 
tigated control scheme employed the dynamics of platform and actuators. Friction 
was included in the model. The aim of the study was to check what is the influence 
of dynamics model accuracy on the quality of control process. The simplifications 
of the model and the problems with finding accurate values of its parameters (es- 
pecially friction parameters) were considered. The results of this study helped to 
decide how big could be the simplifications of the dynamics model and to predict 
what are the possible results of inaccurate determination of the crucial model pa- 
rameters. The obtained results were also helpful when controller parameters were 
searched for. 

The manipulator forward dynamics was modelled using a multibody package, 
which automatically generates and solves the equations of motion. Therefore it was 
relatively easy to introduce changes into the model and to take into account various 
factors, for example joint friction or interactions with environment. There was no 
need for tedious and difficult process of deriving and programming manually the 
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Fig. 4 Schematic view of the model 



necessary equations. Therefore, crude simplifications of the multibody model were 
not required. The additional benefit of using multibody package was the possibility 
to create and watch animations of the manipulator in motion. 

The control system and the electric actuators were modelled in the control soft- 
ware simulation package. Since the model-based control scheme was adopted, the 
inverse dynamics problem had to be solved within the control system. Friction 
effects were included in the inverse dynamics model. In order to enable fast cal- 
culations, the model utilized by the control system was simplified. 

The model scheme is shown in Fig. 4. It is worth noting that electric phenomena 
in actuators were included in the control package part of the model, whereas the me- 
chanics of actuators (inertial properties, gearing etc.) was modelled in the multibody 
package. 



3.1.1 Manipulator Kinematics 

A kinematic scheme of the manipulator is presented in Fig. 5. To simplify the pic- 
ture, details of only one leg are presented. The coordinates of position vectors 
djij = 1, . . ., 6) are constant in the jto frame (established on the manipulator ba- 
sis), whereas coordinates of position vectors s^ (7 = 1 , .... 6) are constant in the 
Ki frame (established on moving platform). 

The kinematics of moving platform is described by absolute coordinates. The 
position of local frame jti in the global frame ito is described by vector r, and the 
orientation of n 1 frame with respect to jto frame is given by three Euler (z — x' — z") 
angles: cpi, (p2, cp^- The coordinates of vector r and angles cpi, cpj, ^3 are assumed 
functions of time. 

The inverse kinematics problem consists in searching for the actuators motion 
(lengths, velocities and accelerations) when the platform motion is given (position, 
velocity, acceleration). Since in the inverse kinematics problem position vector r and 
direction cosines matrix R are given, vector 1/ from point Aj to point Bbj, distance 
Ij and unit vector u/ (see Fig. 5) can be calculated as: 



\j =r + Rs^^ 



'./' 



li 



u/ =l//0- 



(13) 
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Fig. 5 Simplified kinematic 
scheme of the manipulator 




Differentiation of the above formulas leads to the velocity equation: 



ij = ujiy = uj (v + WSj) = ujv - uJsyM = J; 



In the above equation Jy denotes 7 th row of manipulator Jacobian matrix: 



J^ = ["J-»J^^] 



(14) 



(15) 



V and (o are linear and angular velocities, respectively, and symbol w denotes the 
skew-symmetric matrix associated with vector m. 

Similarly, in the case of inverse kinematics, the acceleration equation can be for- 
mulated in a closed form [12]. Finally, actuator lengths, velocities and accelerations 
can be grouped into six-component- vectors: 



[h-'-kf, L = [ii...ief. L=[l\...kf. 



(16) 



The direct kinematics problem consists in searching for the platform motion (posi- 
tion, velocity, acceleration) when the actuators motion is given (lengths, velocities 
and accelerations). Equations (13) can be reformulated to obtain: 



; (r, q,) ^ (r + R {ip) sf - d,)^ (r + K ((p) s^^ - d;) - /J = 0, (17) 







where (p = [i^i i/^i ^3 ] • 

The above equation can be formulated for each of the six legs, thus a set of six 
nonlinear algebraic equations (r and (p are the unknown parameters) is obtained. 
During direct kinematics calculations the equations are solved using the iterative 
Newton-Raphson method. Several solutions can be found, however, we are inter- 
ested only in this one which corresponds to the admissible configuration of the 
manipulator. That is why the initial guess q'* should be chosen carefully. It was 
found that good results are obtained when the iterations start from point q° which 
represents the central point of the manipulator workspace [12]. Some numerical 
tests have proven that iterations converge to the proper solution. 
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Calculation of unknown position (r and (p) is followed by calculation of velocity 
(v and w) and then - acceleration (a and e). Linear equations set are solved in the 
case of velocity and acceleration problems. 



3.1.2 Manipulator Dynamics 

The forward dynamics problem consists in searching for the mechanism motion, 
when forces actuating the mechanism are known. The forward dynamics of manipu- 
lator was modelled using multibody package. This program automatically generates 
and solves multibody system equations of motion. Thus, there is no need to derive 
the motion equations in a full (i.e. not simplified) form. 

The inverse problem of dynamics consists in searching for driving forces, which 
are necessary to obtain the desired motion of mechanism. The manipulator control 
system employs the simplified inverse dynamics model. To simplify calculations, it 
was assumed that all the parts of mechanism, except for the moving platform, are 
massless. Moreover, friction in joints was neglected (the only exception was friction 
in the linear actuators, which is described in the subsequent text). 

It was assumed that platform centre of mass coincides with the origin of Ki 
frame. The platform is characterized by mass m and inertia matrix I^^^ with con- 
stant elements calculated with respect to the local (i.e., moving with the platform) 
frame iti. The inertia properties calculated with respect to the centre of mass and 
axes parallel to the global frame jto are not constant and depend on the platform 
instantaneous orientation. The inertia matrix (with respect to axes parallel to jto 
frame) can be calculated using the following equation: 

I = RI^^^R^. (18) 

The Newton law relates the total force acting on the platform with the platform 
mass and center of mass linear acceleration: 

F = ma. (19) 

The Euler equation relates the total torque about the platform centre of mass with 
the platform angular velocity, acceleration and inertia matrix: 

M = Ie-F wIm. (20) 

For assumed platform motion, force F and torque M can be calculated directly 
from Eqs. (19) and (20). 

The manipulator Jacobian matrix J relates forces Pj developed by actuators to 
total force F and total torque M [13, 14]: 

m]=-J^P- ^ = [P^---Pef- (21) 
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To solve the (simplified) inverse problem of dynamics it is sufficient to perform 
calculations according to Eqs. (18)-(20) and then to solve the set of linear 
equations (21). 



3.1.3 Friction in Actuators 

The moving platform is driven by six linear actuators. Ball-screw mechanism trans- 
forms rotational motion of a DC motor into linear motion of the actuator, thus 
force-torque relation is the following: 

P^ = Tj/h, ill) 

where P^ is the nominal force developed by actuator, Tj is the resultant elec- 
tromagnetic torque applied to DC motor armature and /; is a constant value that 
depends on actuator ball-screw pith of thread and on motor gearing ratio. 

The actuator output force Pj differs from nominal force P^ . The difference is 
caused by the presence of friction effects in actuator. The following model of j th 
actuator friction force Pf was used by the control package during simulations: 



(23) 



where b is the viscous friction coefficient, Fc is the Coulomb friction force, Fs is 
the maximal stiction force, and _P"' is the external force. 



3.1.4 DC Motor 

Permanent magnet DC motors are used to actuate the manipulator. Mechanical part 
of the motor, i.e., stator, rotor and integrated gearbox are modelled in the multibody 
software (which automatically formulates and solves equations), whereas electrody- 
namics of motor is modelled in the control software (where the user must formulate 
equations). 

A simplified DC motor circuit diagram is presented in Fig. 6. Parameters R and L 
represent resistance and inductance of armature winding, respectively. Ue represents 
the back-emf (electromotive force) which is generated when rotor revolves. Thus, 
the following equation of electric circuit can be formulated [15]: 

U{t) = Ue(t) + Li{t) + RI{t), (24) 

where U is the voltage of power supply, / is the armature current and t denotes time. 
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Fig. 6 Simplified circuit 
diagram of permanent magnet 
DC motor 




The back-emf is proportional to the motor angular velocity m and the torque 
generated by the motor is proportional to the current through the windings, hence: 



Ue = K-a)(t), T = K-I (t), 

where K is the motor constant (the same for both equations). 
Equations (24) and (25) can be combined to obtain: 

/ = (U -KeCO-RI)/L. 



(25) 



(26) 



The above equation is integrated during direct dynamics calculations (for each mo- 
tor). Then Eq. (25) is used to driving torque calculation. The calculated torque Tj 
is sent to multibody software and the instantaneous motor angular velocity coj is 
received from this software. 

During inverse dynamics calculations the desired torque is calculated first: 



Tj = Jljlnjh + P^h. 



(27) 



defined as: 
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where / is the rotor, gearbox and screw reduced moment of inertia and P" is 



(28) 



while Pj is the force obtained by solving linear equations set (21). 

Next, Eq. (25) is employed to calculate the armature current, and finally Eq. (24) 
is used to calculate the required control voltage (finite difference method is used to 
estimate the derivative of current). 

3.1.5 Control System 

The control system scheme is presented in Fig. 4. The control law is designed to 
reduce the control errors on position and velocity levels simultaneously. The control 
currents are calculated to satisfy the following error dynamics equation: 



e + K,,e + KpC 



0, 



(29) 
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where e = L/j — L is the position error (L/j is a six-element vector of desired 
actuator lengths), K^ = fcplexe and Ky = /cylexe are diagonal matrices of control 
gains. The gain coefficients kp and k^ are selected to achieve the critical damping 
of the system described by Eq. (29). 

At the beginning of computation the required motion of the platform is calcu- 
lated. Then the inverse kinematics problem is solved to find the actuators' desired 
lengths L/) , velocities L/j and accelerations L^ . 

In the real manipulator the actual actuators lengths L and velocities L are mea- 
sured by appropriate sensors. In the simulation model these values are computed by 
the multibody package, to provide feedback for the control system model. 

The next step of computations consists in accelerations calculation. For given 
vectors of L/j , L/j , L/j , L and L, the vector of accelerations L, which satisfies 
Eq. (29), is calculated: 

L = Lfl + K,e + K^e = Ld+ K, (L^ - L) + K^ (L^ - L) . (30) 

Then the inverse problem of dynamics is solved. The driving forces necessary to 
produce the required motion (described by La, La and L/j) are calculated. Fric- 
tion forces in actuators are taken into account. The inverse dynamics calculations 
must be preceded by the robot direct kinematic solution, to obtain the platform 
motion. The last step of computations consists in calculation of control voltage for 
all actuators. 



3.2 Results 

The presented simulation model of a parallel manipulator, with ball-screw actuators 
driven by DC motors and with model-based control system, enables to analyze var- 
ious problems concerning the system behaviour. The simulation results provide the 
robot designer with data necessary to make decisions. Since the model-based con- 
trol is employed, this particular study was focused on checking what is the influence 
of dynamics model simplifications and model parameter uncertainties on the quality 
of control process. 

The actuator masses are neglected in the inverse dynamics model, thus the mov- 
ing platform is the only system element with non-zero mass. In the inverse dynamics 
model, which is employed by the control system, the actuator masses may be consid- 
ered in the simplified way, by appropriate enlargement of the moving platform mass. 
The obtained results showed that inverse dynamics model simplifications consisting 
in neglecting the mass of actuators have relatively little influence on position errors 
and even smaller influence on velocity errors. It was also found, that the effects of 
actuators masses neglecting can be significantly reduced by appropriate changes in 
the modelled mass of the platform. Thus, there is no need to use more accurate (and 
requiring much more computation) inverse dynamics model. 
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Employing a friction model in the model-based control inverse dynamics calcu- 
lations can reduce the problems caused by actuators friction. It should be stated, 
however, that friction is a complicated phenomenon and its computational models 
are usually severely simplified. Moreover, the friction parameters are usually diffi- 
cult to measure and quite often are time-varying. Thus, it should be expected that 
the friction model utilized during the inverse dynamics calculations would not be 
accurate. A series of simulations was performed in order to check what is the in- 
fluence of friction forces and friction model inaccuracies on the obtained quality of 
control. In some simulations the friction parameters used by the inverse dynamics 
model were different than friction parameters used by the direct dynamics model 
(underestimated or overestimated). It was found that friction effects should be intro- 
duced to the inverse dynamics model, since it importantly improves the quality of 
control. The simulation results showed that the friction parameters should be identi- 
fied with big accuracy. If the friction model parameters are not accurate enough, the 
quality of control does not improve. It was also found during investigations that the 
parameters describing stiction-friction transition effects are the crucial ones. 

At the end it is worth noting that the presented simulation model can be modified 
with small effort, thus the designer can easily investigate and assess various variants 
of whole robot or its subsystems. 



4 Dynamic Analysis of a Flexible Power Transmission 
Mechanism 

General multibody formalism can be also used for robot modelling when flexibility 
effects must be taken into account. The MBS code is applied on early stage of robot 
synthesis and, using general purpose tools, allows predicting various characteristics 
of mechanisms. 

The most common approach adapted in many commercial and research MBS 
packages is based on floating frame of reference formulation (FRF). In this approach 
models of flexible bodies are usually prepared using classical FEM formulations and 
then transferred to MBS code as substructures in various representations. Kinematic 
and dynamic analysis is most often performed directly in MBS program. The well 
known drawback of floating frame formulation is the restriction that deformations 
of bodies should be small (although displacements are large). Moreover, in prac- 
tice substructure is represented by only dozen or so first modes, and the change of 
number of modes requires additional time-consuming calculations. 

In this chapter we present dynamic analysis of a flexible parallelogram mech- 
anism. This mechanism is used for power transmission in POLYCRANK robot 
designed in the Warsaw University of Technology [16]. A general flexible MBS 
method, based on FRF was used. A method of joint friction modelling that accounts 
for assembly stresses was also applied. Output functions, i.e. angular velocities and 
accelerations of selected bodies, friction and reaction forces in kinematic pairs, as 



146 



J. Frgczek and M. Wojtyra 



well as stress maps in selected bodies are presented. In addition, a discussion of 
different integration methods influence on the results accuracy is briefly presented. 
All computations were carried out using CAD/CAE packages commonly used in 
virtual prototyping [17], 



4.1 An Outline of Dynamic Analysis of Flexible MBS 

In classical absolute coordinates formulations an intermediate body-fixed local ref- 
erence frame (called a. floating frame) is often introduced to describe large motion 
and small deformations of a body. Six generalized coordinates representing large 
motion of the body (e.g., three Cartesian coordinates of the centre of floating frame 
and three Euler angles) describe local body reference frame position and orientation 
with respect to the global (fixed) reference frame. Body deformations are described 
using FEM linear theory with respect to the local reference frame. Equations of mo- 
tion of the whole system may be written using, for example, Lagrange equations of 
the first kind in the form [18]: 
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where: q = r^ ^ q^^ = q|^ q^ is a vector of the rigid and flexi- 
ble general coordinates, m are mass matrices, D is a dumping matrix (resulting 
from dissipation function), $ is a vector of constraint equations imposed by joints 
connecting bodies, ^T A is a vector of reaction forces represented by Lagrange mul- 
tipliers, Q, is a vector of external forces applied to the body, Qy is a vector of 
centrifugal, Coriolis and other forces which result from differentiation of the kinetic 
energy with respect to time and coordinates. 

Equations of motion (31) form a system of differential-algebraic equations 
(DAE) with large number of unknowns, which depends directly on the number of 
FEM model degrees of freedom. That is why, in general, reduction of degrees of 
freedom is performed with usage of the modal synthesis algorithm. One of the most 
common approaches in the structural dynamics is the Craig-Bampton method [19]. 
Using the Craig-Bampton modes system of equations (31) can be represented in the 
new modal coordinates: 
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where ^ is a rectangular transformation matrix, p is a new vector of coordinates 
obtained after coordinate transformation, I is an identity matrix and d is a matrix of 
modal damping. 

It is worth noting that: 

• The size of the system of equations (32) depends on the number of modal coordi- 
nates considered in transformation matrix ^ . From numerical point of view, size 
of this matrix should be small. However, it cannot be chosen arbitrarily, since 
matrix ^ contains information about modal content of solution. 

• The system of DAE equations (32) has high differential index (equal to 3). For 
this kind of system many integration methods exist; it should be pointed out that 
effectiveness of the numerical integration of this system depends not only on the 
integration algorithm but also on the form of the system and method of index 
stabilization. 



4.2 Power Transmission Mechanism of POLYCRANK Robot 

The floating reference frame approach was implemented to model fragments of 
power transmission mechanism of the POLYCRANK robot [16]. This robot pro- 
totype was built in Warsaw University of Technology. Unique feature of this 
construction is a possibility to perform unlimited rotational motion in almost all 
actuated joints. The robot links have the shape of diagonal cranks. Those cranks 
have coating construction with composite shields. Direct drive motors, situated at 
the robot base, are used to actuate the manipulator. The motors are connected to 
driven links via internal power transmission mechanisms, built of spatial parallelo- 
grams. The main parts of investigated mechanisms are shown in Fig. 7. The robot 
has six degrees of freedom and consists of seven links. 

The complete geometrical model of the power transmission mechanism frag- 
ment, mounted into parts 4 and 5, is shown in Fig. 7b and was built in CAD 
environment. The power is transmitted via system of parallelograms. Each spatial 
parallelogram consists of: 

• Rotational discs, connected mutually with cross-roller bearings 

• Composite shields, which serve as protection of the power transmission mecha- 
nism and make the construction stiffer 

• Six connecting rods and additional elements for bearings fixing 

Two parallelogram mechanisms are placed inside robot part 4 (Fig. 7a) and other 
two inside part 5. They transmit power from direct the drive motors to the gripper. 
The fifth, outer parallelogram mechanism transmits power to part 5, enabling its 
unlimited rotation about vertical axis. 

Figure 8a illustrates part 4 with visible outer parallelogram mechanism and two 
inner parallelograms covered by the shield. Parallelogram linkages consist of paral- 
lel discs and rods which transmit power between discs. Each parallelogram linkage 
has six rods (each pull rod is ended with two heads). 
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Fig. 7 POLYCRANK robot and its elements [16]: (a) General view (0 - robot base, 1, 2, 3, 4, 5, 6 - 
robot parts, 7 - gripper), (b) power transmission mechanism 





IW^ 




Fig. 8 Power transmission mechanism: (a) part 4 CAD model, (b) kinematic scheme of five 
parallelograms 



Kinematic scheme of the five-parallelogram mechanism is presented in Fig. 8b. 
For simplicity, only two rods of each parallelogram linkage are considered. 

A flexible MBS model of the power transmission mechanism was built and used 
for estimation of loads characteristics and body stresses for different variants of 
motion. 

Due to the high complexity of robot mechanisms, some simplifications were 
made in the model. Only pull rods and shields were modelled as flexible. Flexi- 
bility of bearings and joints was neglected. The small parts, which do not influence 
on the robot dynamics, like screws or washers, were not included in the model. 
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Flexible models of the shield and the double-nutted pull rod were prepared us- 
ing FEM package [20]. The CAD geometry models of shields were meshed using 
SOLID186 finite elements to obtain flexible models in FEM package (Fig. 9a). In 
order to connect body to other elements (discs) spider webs (beam elements) were 
created. 

The simplified mesh of the rod built of SOLID45 elements is shown in Fig. 9b. 
Rigid body elements were used for simplified joint modelling (at both ends "spider 
webs" were created). For both types of models (shield and pull rod) substructures 
were created using the Craig-Bampton technique [19]. Modal content of flexible 
bodies was verified by comparison of modal frequencies calculated separately in 
FEM and in MBS [17, 20] environment. 

All parts in the model were assembled using MBS techniques, to create a 
power transmission mechanism according to kinematic scheme presented in Fig. 8b. 
Additional point mass (5 kg), representing pay load carried by the robot (gripper and 
manipulated object), was placed on the upper disc of the robot. All further calcula- 
tions were carried out in MBS environment. 

In the first step, a modal analysis of the whole manipulator with blocked rigid 
degrees of freedom was performed for various selected configurations. Knowledge 
about the lowest frequencies level of the mechanism natural vibrations is often 
important from the point of view of control system synthesis. Modal shapes cor- 
responding to the first modal frequency (about 1 10 Hz) are shown in Fig. 10. Modal 
shape, corresponding to this frequency, represents bending of the construction along 
horizontal axis. 




Fig. 9 FEM models: (a) part 5 shield, (b) pull rod 




Fig. 10 Parts 4 and 5 modal shapes corresponding to the lowest modal frequency 
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In the next step, several dynamic analyses of the robot mechanism were per- 
formed; various time courses of driving torques were taken into consideration. 
The following quantities were analyzed: 

• Angles of rotations, angular velocities and accelerations of robots parts. 

• Reaction forces in joints with and without friction. 

• Pull rods von Mises reduced stresses. 

Figures 12 and 13 show the results obtained for mechanism driven by trapezoidal 
input torques (presented in Fig. 11). The level of stresses in links appeared to be 
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Fig. 13 Reaction forces in joint connecting cap and disc 



less then several dozen of megapascal. Presence of friction and rods' preloads do 
not change stresses significantly, however, it can be noticed (Fig. 13) that reaction 
forces in joints become considerably greater when joint friction and initial loads 
are taken into consideration. Moreover, friction and preload result in energy losses 
bigger than in other cases (Fig. 12). 

Finally it is worth noting that the Gear algorithm with the GGL technique of 
constraint stabilization was used for integration of DAE equations of motion. This 
algorithm was chosen on the basis of the initial tests which proved its good numeri- 
cal efficiency. 



5 Conclusions 



General MBS methods can be used to estimate different robot characteristics. These 
methods can be useful in case of kinematic, as well as dynamic analysis. When 
combined with other CAE methods, the general multibody approach can be used in 
interdisciplinary calculations, and thus complex robotic systems can be analysed. 

General multibody methods do not require closed form kinematics solution, 
moreover, calculations might be helpful when robot singular positions are analysed. 
These methods can play important role during simulation and analysis of complete 
robot with actuators and control system. They also become a valuable tool when 
effects of flexibility and joint friction should be incorporated into model. 

The other important feature of general MBS formalism is that models can be 
easily parameterized. This makes robot performance evaluation more convenient 
and allows analysing various parameters. Moreover, design optimization can be per- 
formed easily from different points of view. 
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Energy Considerations for the Stabilization 
of Constrained Mechanical Systems 
with Velocity Projection 

Juan C. Garcia Orden and Roberto A. Ortega Aguilera 



Abstract There are many difficulties involved in the numerical integration of 
index-3 Differential Algebraic Equations (DAEs), mainly related to stability, in the 
context of mechanical systems. An integrator that exactly enforces the constraint at 
position level may produce a discrete solution that departs from the velocity and/or 
acceleration constraint manifolds (invariants). This behaviour affects the stability 
of the numerical scheme, resulting in the use of stabilization techniques based on 
enforcing the invariants. A coordinate projection is a poststabilization technique 
where the solution obtained by a suitable DAE integrator is forced back to the in- 
variant manifolds. This paper analyzes the energy balance of a velocity projection, 
providing an alternative interpretation of its effect on the stability and a practical 
criterion for the projection matrix selection. 



1 Introduction 

Many engineering applications involve the dynamics of several bodies, rigid or 
deformable, undergoing large motions. Very often the motion of these systems is 
constrained, because there are joints than connect the different parts, or due to pre- 
scribed displacements imposed by the environment. 

The mathematical models associated with these type of systems are typically 
formulated in terms of index-3 Differential Algebraic Equation (DAE) systems, 
composed of a set of differential equations, plus a set of algebraic constraint 
equations expressing additional relations among the generalized coordinates of the 
model. The numerical solution of these systems poses several difficulties, mostly 
related to the stability of the available integration schemes. 
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Direct integration of DAEs with an index higher than one is usually not per- 
formed due to stability problems [8], although there have been some recent 
successful applications based on a second order generalized-a method applied 
to index-2 and index-3 DAEs (see [2] and references therein). On the other hand, 
index reduction through the analytical differentiation of the constraint equations 
causes the progressive drift of the computed solution from the position, velocity or 
acceleration constraint manifolds (which are invariants of the system) during the 
simulation. This is the point of departure of several stabilization methods found in 
the literature [3,4,6]. 

A coordinate projection is a poststabilization technique based on the solu- 
tion of a constraint minimization problem, enforcing the solution obtained from 
the integrator back to the invariant's manifold. This technique has been studied 
and successfully applied to practical mechanical models by several researchers 
[1, 5, 7, 9, 10, 12, 13, 22, 25]. A detailed analysis and a discussion of the applica- 
bility of this technique can be found in the references and is beyond the scope of 
this paper. Nevertheless, two relevant aspects, related to the performance of this 
technique as applied to mechanical problems, are not found in the literature. 

The first aspect is the relationship between the projection and the mechanical 
energy balance. It is desirable, from a physical point of view, for the behaviour of the 
energy of the numerical solution to be consistent with the energy of the continuous 
model. But this aspect is important also from the algorithmic point of view, due to 
the close relationship between the behaviour of the discrete energy computed by a 
numerical scheme and its stability [26] . 

The second aspect is the selection of the projection matrix. Eich [12] proposes 
an orthogonal projection on the null-space of the invariant, while in [1, 7, 9, 25] a 
mass-orthogonal projection is employed, and tested in several examples with very 
good results. References [10, 1 1] propose a projection based on the mass matrix plus 
other terms related to the linearized damping and elastic forces of the system, which 
is numerically more efficient. In fact, from a purely mathematical point of view, 
any positive definite matrix qualifies for a coordinate projection, which justifies the 
interest in searching for a practical criterion for selecting the projection matrix. 

This paper focuses on these two aspects, analyzing first the energy balance in- 
volved in a coordinate projection on velocities. The results of this analysis provide 
an alternative interpretation of the performance of the technique, leading to a prac- 
tical criterion for the matrix selection. 



2 Constrained Dynamics Formulation 

The point of departure is the formulation of the dynamics of a mechanical system 
with a configuration defined by the set of generalized coordinates q e R" , under the 
action of applied forces f(q, q, t) and subjected to a set of r holonomic constraints 
$ : M" X [0, r] ^ M^ such that $(q. t) = 0. 
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The Lagrange multiplier method leads to an index-3 DAE system given by: 

Mq+$^A = Q, $=0, (1) 

M being the mass matrix, A e M'' the vector of Lagrange multiphers, and denoted 

def * def 

by ( )q = 9( )/9q and ( ) = d( )/df . The vector of generalized forces Q(q, q, t) 
accounts for the applied forces f and additional terms (gyroscopic, etc.) that may 
appear due to the particular type of generalized coordinates. If q are Cartesian coor- 
dinates of selected points of the system, these additional terms vanish and Q = f . 

An exact integration of the index-3 DAE system (1) in its original form (meaning 
that no index reduction is performed) would provide a solution that exactly satisfies 
the constraint at position level ($ = 0). In this case, the constraints at velocity and 
acceleration levels would also be automatically exactly enforced. This means that a 
computed solution q(?) would automatically verify ^ = <> = 0, with no further 
considerations; we call these invariants of the system. 

But this situation does not hold in general for the direct numerical solution due to 
the approximations introduced into the computations. This means that, even though 
the computed solution satisfies the constraint at position level in a numerical sense 
(meaning that its error is below the machine precision), the solution may signifi- 
cantly violate the constraint at velocity and acceleration levels. 

A similar situation arises when the index of the DAE system (1) is reduced by 
means of a double differentiation of the constraint equation, leading to the (under- 
lying) ODE system. In this case, the numerical integration provides a solution that 
satisfies the constraint at acceleration level (^ = 0), but progressively violates the 
constraints at position and velocity levels. 

These facts justify the search for algorithms that force the numerical solution to 
remain on all the invariant's manifolds. This is the point of departure of different 
stabilization methods proposed in the literature; one of them is a particular poststa- 
bilization technique known as coordinate projection. 



3 Coordinate Projection 

With this technique, a time-stepping method is applied to (1) in order to obtain a 
solution for each time step, followed by a projection to bring the solution back to 
the invariant manifold. 

In the case of a velocity projection, the velocities q* computed with the integra- 
tor are projected onto the velocity constraint manifold to obtain new velocities q, 
solving a constrained minimization problem given by: 

min-(q-q*)'^A(q-q*) subject to 4=0, (2) 

q 2 
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A being a symmetric and positive definite matrix. This minimization problem can be 
solved with different methods. For instance, in [25] a Lagrange multiplier method 
with a Newton-type iteration is employed, while in [7] an augmented-Lagrange 
method is used. In [ 1 0] a penalty method is used with excellent results. In the present 
paper we choose the same approach, a penalty method, since it allows us to ob- 
tain a closed expression for the projected velocities, while performing an efficient 
projection. 

The penalty method transforms the constrained problem (2) into an uncon- 
strained one, introducing a penalty parameter a > and leading to an algebraic 
equation for q given by: 

A{q-q*) + 4>la4>=0 (3) 

The terms ^ and ^q can be further elaborated as: 

9$ 3$ • 9^ 

And assuming that the constraint does not explicitly depend on time ($( = 0), from 
(3) the following linear algebraic system for the unknown q is obtained: 

(A + a<S>l<i>^) q = Aq* (4) 

Remark 1. The fact that the projection matrix A is positive definite and a > 
guarantees that the linear system given by (4) is non-singular, which means that the 
projected velocities q are always computable. In order to justify this proposition, it 
is only necessary to employ some standard linear algebra results, which will be used 
here without further proof. Recalling that A is positive definite and using the fact 
that ^n ^q is positive semidefinite, the following relation holds for all x 7^ 0: 

x'^ (A -I- a$q$q) X = x'^Ax -l-a x"^ (^q^q) X > , 

which means that matrix (A -I- a^^^q) is positive definite and, as a consequence, 
it is non-singular. 

Note that the use of the penalty method to solve the minimization problem (2) 
is approximate, in the sense that, in general, the projected velocities q do not ex- 
actly lie on the velocity constraint manifold ^. In fact, it can be shown that, if a 
projected velocity q satisfies the velocity constraint, it is because the original ve- 
locity q* already satisfied this constraint. This assertion is justified in the following 
proposition: 

Proposition 1. If the velocity before projection fq*J or the velocity after projection 
fqj satisfies the velocity constraint $ = 0, then q* = q. 

A detailed proof of this proposition can be found in [15]. 
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Projections can also be performed at the position and acceleration levels. For 
instance, in the case of an acceleration projection, the accelerations computed with 
the ODE integrator (q*) are projected onto the acceleration constraint manifold to 
obtain new accelerations (q), solving a constrained minimization problem given by: 

min-(q-q*)^A(q-q*) subject to <> = 0, (5) 

q 2 

A being a positive definite matrix.' Again, this constrained minimization problem 
can be solved with penalty, which leads to the solution for q as a linear algebraic 
system given by: 

(A + Q!$^^q)q = Aq*-a$^iqq (6) 

The analysis of position and acceleration projections is outside the scope of this 
paper, and only a velocity projection will be considered. This is justified by the re- 
sults reported in [1,22,25]. These authors show that errors in the velocity constraint 
are more critical for the numerical solution than errors in the position constraint, 
coming to the conclusion that velocity projection is the most efficient projection for 
improving numerical integration. 



4 Total Energy Balance 

For systems of ODEs arising from the dynamics of mechanical systems, the sta- 
bility of the numerical methods used to solve them is often related to the concept 
of energy. Actually, in the linear case, exact algorithmic energy conservation leads 
to unconditional stability, as happens, for instance, with the trapezoidal rule [23]. 
However, this direct relationship does not hold for the nonlinear case [24,26], which 
is the case of the equations resulting from practical multibody systems. Neverthe- 
less, exact conservation of energy (or unconditional energy dissipation) has revealed 
itself to be extremely useful in the design of robust integration schemes, with excel- 
lent stability in the nonlinear case ([26] and references therein) and applied to the 
dynamics of multibody systems [16-18,20]. 

With these arguments in mind, it is interesting to analyze how the coordinate 
projection behaves in terms of energy balance. As will be shown below, it turns out 
that the projection actually controls the energy, therefore providing a new point of 
view for the understanding of its stabilization properties. 

In order to establish a suitable point of departure, let us consider a constrained 
mechanical system, represented by a set of coordinates q e M" , subjected to a set 
of r holonomic constraints $(q) e W and without applied forces. The dynamics 
of this system are represented by the index-3 DAE: 

Mq + Q*(q) = Q, 4» = (7) 



' Not necessarily the same employed for the velocity projection. 
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Q$ being the constraint force vector, which in the case of the Lagrange multipUer 
method is given by Q$ = ^^A. The generalized force vector Q vanishes if q 
contains Cartesian coordinates of selected points of the system. 

Remark 2. The fact that no applied forces (e.g. external loads or internal forces in 
discretized deformable bodies) are considered in (7) does not limit the applicabil- 
ity of the developments presented in the next sections. This is due to the fact that 
the velocity projection does not affect the work performed by these forces, which 
typically depends only on positions. 

Remark 3. The dynamical system represented by (7) is conservative (the total me- 
chanical energy remains constant), since the work performed by the holonomic 
constraints which do not depend explicitly on time is zero. 

Directly integrating the index-3 DAE (7) from ?„ to ?„+i provides a solution 
Qn+i that exactly satisfies the position constraint. In consequence, the constraint 
force at f„+i takes the value Q$,,_i_j = $q A„+i, A„+i being the vector of exact 
Lagrange multipliers. 

A velocity vector q*_|_i is also obtained, but in general, the velocity constraint 
^„+i is not exactly satisfied. In order to move the solution back to the velocity 
constraint manifold, let us assume that a velocity projection is performed at the end 
of each time step as explained in Sect. 3, obtaining a new velocity vector q„+i. 

The total discrete energy balance AE between t„ and f„+i is given by: 

AE = iqT^iMq„+i - ^q;^Mq„ (8) 

Note that the energy balance AE given by (8) equals the kinetic energy balance. 
This is due to the fact that there are no applied forces, the position constraints are 
exactly satisfied, and the position q«+i does not change under the projection. 

Adding and subtracting a term (l/2)q*_,_jMq*_,_j in (8), the following relation is 
obtained: 

^E = -C+iMC+i - ^qlMq„ + -qT^iMq„+i - -q:;iMq:+i, (9) 
AEi AEp 

AEj being the energy variation introduced by the ODE integrator, and AEp the 
energy variation introduced by the velocity projection. 

It is not difficult to obtain an expression for the energy variation AEj intro- 
duced by a standard ODE integrator. The point of departure is the first term of (9) 
rewritten as: 

AEi = ^(C+i + q«)^M(q:+i - q„) (10) 
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and using the algorithmic expressions of the method with the original system (7). 
For instance, for the trapezoidal rule the following relations hold: 

2 

q*+i + qn = ^(q«+i-q«) 
q:+i - q« = - y^'MQt,, + Qt„+0 

with Qt = Q* — Q, which introduced into expression (10) give, after some algebra: 

AEi =-(q„+i-q„)^QT , ,, (11) 



where the notation (On+i = [(•)« + (On+i] /2- has been employed. 

Another example is the implicit midpoint rule, which introduces an energy vari- 
ation given by: 

AEi = -(q„+i - qnV Qr . (12) 

where (■)„+! denotes evaluation at the midpoint. Note that, in a general nonlinear 
case, Qx y^ Qt r and AE, ^ can be positive or negative. Note also from 

'1+2 "+3 

(11) and (12) that both numerical schemes are the same and exactly conserve energy 
(AEi = 0) in the linear case. 

Another interesting example is a conserving algorithm, which does not introduce 
artificial energy by means of a specific formulation of the force Q^: 

Z\£,- =-(q„+i-q„)TQ^ = (13) 

Details about the formulation of Q^ with Cartesian coordinates (Qt = Q*) em- 
ploying the Lagrange multipliers method and the augmented Lagrange multipliers 
method can be found in [21] and [19], respectively. 

Other expressions similar to (11-13) can be obtained for other integrators, but an 
exhaustive description falls outside the scope of the work presented here. It is im- 
portant to remark that the sign of the energy contribution AEi may not be constant 
throughout the simulation, thus increasing or decreasing the total energy, which can 
in turn affect numerical stability. 

The second contribution to the energy variation is AEp, associated with the ve- 
locity projection described in Sect. 3, and can be obtained solving a minimization 
problem with a definite positive matrix A using a penalty method. This leads to the 
solution for qn+i of the linear algebraic equation system (4), given by: 

q„+i=p-iq:+i with P=(l + aA~i<&T$q) (14) 

Introducing the first expression in (14) in the following relation for AEp: 

^Ep = -(q„+i +C+i)^M(q„+i -q*+i) 
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an expression is obtained for the energy variation introduced in the velocity 
projection: 

AEp=ql^,T)q„+i with D = ^ (1 + P^MCl - P) (15) 

Therefore, the effect of the projection upon the energy depends of the properties of 
the matrix D, which is the matrix associated with the quadratic form AEp, and gov- 
erns the damping behaviour of the projection. If this matrix is negative semidefinite, 
artificial energy growth is avoided in all cases, and a significant improvement in the 
stability of the overall numerical scheme would be expected. 

In what follows, a detailed analysis of this projection energy balance is per- 
formed, which will provide a practical assessment of the suitable choice for pro- 
jection matrix A, so that artificial energy growth is unconditionally avoided. 



5 Projection Energy Balance 

5.7 Some Preliminary Results 

The point of departure is to perform a quick inspection of the basic properties of the 
damping matrix D based on its definition (15) and some basic linear algebra results. 
It follows that matrices A~^ and ^I^q are symmetric and positive semidefinite. 
However, matrix P defined in (14) is not, in general, positive semidefinite, or even 
symmetric. As a consequence, the damping matrix D given by (15) will not be sym- 
metric, and nothing can be said in general about its definiteness. This means that, 
following this procedure, it is not possible to bring the sign of the energy balance 
AEp forward at each time step. 

Nevertheless, it is possible to get more information about the quadratic form 
AEp as explained in the following proposition: 

Proposition 2. The quadratic form AEp given by (15) is degenerate, i.e. its 
kernel /Cq.' 

/Cd = {xeM"; y'^Dx = 0. Vy gM"} (16) 

contains other vectors than the zero vector Specifically, the set C of velocity vectors 
which are compatible with the constraint $.■ 

C = {c|gM"; 6 = $qq = 0} 

is a subset of the kernel, thus C C /Cd- 

A detailed proof of this proposition can be found in [15]. 

This result was already expected, recalling from Proposition 1 that projected 
velocities that are compatible come from compatible original velocities, which 
means that projection leaves them unchanged. Note also that this result does not 
exclude the possibility that incompatible velocities may be undamped; in other 
words, C may not coincide with JCd ■ 
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Summarizing, this preliminary inspection reveals that there are few things to 
say about the damping introduced by the projection with a general definite positive 
projection matrix A, apart from the expected fact that compatible velocities never 
introduce artificial energy. 

The next step is to try to find a set of requirements such that, if satisfied by the 
projection matrix A, they would determine the behaviour of the projection energy 
balance. If achieved, this result would help in the selection of the projection matrix, 
which is one of the main goals of this paper. 



5.2 Conditions for Energy Dissipation 

It is a basic linear algebra result that any quadratic form may always be expressed 
in terms of a symmetric matrix. This means that, in order to analyze the properties 
of the quadratic form given by (15), it is possible to work just with the symmetric 
part of matrix D. Denoting the symmetric and skew-symmetric parts of the original 
matrix by superscripts s and h, respectively, this result may be expressed as: 

x^Dx = x^D^x + x^d'^x for all x e R" , (17) 



where the symmetric matrix D^ can be expressed, after some algebraic manipula- 
tions using definition (15), as: 

D^ = - (D -F D^) = - (M - P^MP) 

Remark 4. Matrices D and D^ both have the same definiteness property (definite, 
semidefinite, etc.), as immediately follows from (17); and their associated quadratic 
forms have the same kernel /Co = A^D'^ ■ 

Matrix D'' can be further elaborated and written in terms of the projection matrix 
A and the Jacobian $q using (14) for P, obtaining: 



D^ = -a (b' + -aB^M-^B j 



(18) 
\ z ) 

B being a matrix given by: 

B = MA-i$^^q, (19) 

and again denoting the symmetric part of the matrix by the superscript s. Based on 
(18), the energy balance of the projection may be expressed as: 

AEp = q^D^q = -aq^B'q - -a^ (Bq)^M"^ (Bq) 

= AEp^+AEp^. (20) 
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Note that, since the penalty method is employed to solve the minimization 
problem (2), the projected velocity vector q may be incompatible (meaning that 
it may not lie exactly over the velocity constraint manifold ^). 

Thus, the projection energy balance is positive or negative depending on the sign 
of each term AEp^ and AEp^ in (20) for an incompatible projected velocity q^ C. 
(Recall from Proposition 2 that AEp = for a compatible velocity q e C.) 

In [15] a detailed analysis of these terms is carried out, proving the following 
important proposition: 

Proposition 3. If q is incompatible and matrix W = (MA~^$^$q) is positive 
semidefinite, the projection introduces non-negative energy dissipation, AEp < 0. 

What is more, if matrix (MA~^)'' is definite, then the projection introduces posi- 
tive energy dissipation, AEp < 0. 

The next proposition justifies the positive performance of projections based on 
the mass matrix M: 

Proposition 4. A velocity projection performed with the mass matrix (A = Mj 
introduces non-negative energy dissipation. 

We are immediately able to prove this based on the previous results, because in this 
case: 

is always a symmetric and positive semidefinite matrix. Additionally, in this 
case (MA~^)'' = 1, which is a definite matrix, guaranteeing energy dissipation 
{AEp < 0) for incompatible velocities. D 

Next, two numerical experiments are presented in order to verify the theoretical 
results outlined in the previous sections. 



6 Numerical Experiments 
6.1 Two Particle System 

Let us consider a mechanical system composed of two particles with masses mi = 1 
and m2 = fJ. > moving along a smooth horizontal line, as depicted in Fig. 1. The 
configuration of the system is defined by the vector of coordinates q = (^i, ^2)^ 
containing the distances of the particles from a fixed point on the line. In addition, 
there is a holonomic constraint ^(q) = q^q — 1 = ^^ + ^| — 1 =0, and as a 
consequence the system has only one degree of freedom. 

The motion starts at f = from position qo = (0, 1)^ with velocity qo = (1,0)^. 
Taking into account that <I>q = 2q, it is easy to verify that the constraints at position 
and velocity levels are satisfied at f = 0: 

«'o = qjqo -1=0, <i>o = <5^qo = 2(0, 1) • (1,0)^ = 
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Fig. 1 Two-particle example 
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Fig. 2 Conserving integration with and without projections, At = 0.1 s 



The motion is integrated up to 10 s with a conserving augmented Lagrangian 
scheme in position with a penalty of 10^ (see [19] for details of the formulation), 
such that the constraint at position level is exactly satisfied with exact energy con- 
servation, AEi = 0, as expressed in Sect. 4 with expression (13). No projections 
are performed. 

Figures 2a,b show, respectively, the evolution of the position q and velocity q 
in time for /x = 0.1 with a constant time step At = 0.04 s. Figure 2c shows the 
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discrete energy, which is exactly constant, as expected, and Fig. 2d shows the La- 
grange multiplier, which is related to the constraint force by Q* = ^^X = 2Aq. 

Figures 2e,f show the constraints at position and velocity levels. It is possible to 
observe that the position constraint remains small (~10~^), but Fig. 2f shows that 
the velocity constraint is much larger (~10~^), as expected since no projections 
are performed. Nevertheless, the energy control performed by the integrator seems 
capable of handling this undesirable effect, avoiding a noticeable increase in the 
velocity constraint violation during the integration. 

A second set of experiments is performed next, using a larger time integration 
step At =0.1 s. As shown in Fig. 3a, the integration fails to converge at f = 5.2 s 
when no projections are performed, despite energy being exactly conserved. 

Figure 3b shows that the instability at the end of the integration is related to the 
large oscillations on the Lagrange multiplier. Figure 3c shows that the position con- 
straint is satisfied up to the failure, as expected, but the violation of the velocity 
constraint shown in Fig. 3d is larger and exhibits a growing trend. Thus, it is reason- 
able to conclude that the growth of the velocity constraint violation is the ultimate 
cause of the instability that produces the ultimate failure of the integration. 

If a velocity projection with (2) and (4) is performed, the velocity constraint may 
be significantly reduced and the integration may be carried up to f = 10 s, but the 
result depends on the projection matrix A employed. 
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Fig. 3 Conserving integration with and without projections, At = 0.1 s 
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Two matrices are tested: the mass matrix A = M = f | with u = 0. 1 and a 

positive definite matrix A = W of the form: 

W=K°| with a>0, 



fa 0\ 

Voij 



with a = 15 and a projection penalty parameter a = I. This penalty parameter 
is far too small for practical simulations, but it will serve to more clearly show the 
performance of the projection technique. 

Figures 3c,d show that, as expected, both projection matrices accomplish the task 
of reducing the velocity constraint, and they also stabilize the integration such that 
it may be carried up to the end time. But Fig. 3a shows that there are important 
differences in the behaviour of the energy: while M avoids the growth of energy 
(as predicted in Proposition 4), matrix W causes an artificial growth of energy. This 
behaviour is justified by the properties of matrix B^, which are in this case: 



^ \M\q2 Mi I 

B^ = i(B + BT)=2f 2?i/" 



2 
2 



jd) l[iq^ 



It can be shown that matrix B'^ is not positive semidefinite, because its determi- 
nant is: 

det(BO = 4 (^q\<li^ - qWi (/^ + ^) = -^'^1^2 (1 - aiJ-f < 

According to Proposition 3, this means that the energy may increase or decrease, 
which is in fact the behaviour shown in Fig. 3a. 

Looking more closely at the projection performed with matrix W, a quick in- 
spection of matrix MW~^ reveals that it is definite: 



V 






which means that there could be incompatible velocities q associated with a projec- 
tion that does not modify the energy (AEp = 0). 

Finally, in order to study the effect of the penalty parameter a in relation to 
the projection energy balance, a new experiment is performed with the same pro- 
jection matrix W and a larger projection penalty parameter, a = 20. Figure 4a 
shows the behaviour of the velocity constraint $ in a long simulation, up to 50 s. 
The projection with M and a = 1 retains a small violation of the velocity con- 
straint, producing a stable integration although introducing significant dissipation 
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Fig. 4 Conserving integration with projections, At = 0.1 s 




Fig. 5 Five-bar pendulum, initial configuration 

(Et=5o/ E[=o ~ 0.74). On the other hand, the projection with W and a = 20 con- 
trols the violation of the constraint fairly well at the beginning of the simulation; but 
it fails to keep it small, with integration failure occurring at f = 35.5 s. The large 
amount of energy introduced by the projection {£1=^5,5/ Et=a — 2.54) shown in 
Fig. 4b is responsible for the growth of the velocity constraint's violation and the 
ultimate failure of the computation. 

Comparing Figs. 3a,b, it is clear that the artificial energy added to the system by 
projecting with W increases significantly when the projection penalty parameter a 
increases. While the energy at the end of the computation for a = 1 is Et=io — 
0.53 J (as shown in Fig. 3a), the energy for a = 20 shown in Fig. 4b is Et=\a — 
0.67 J. 



6.2 Five-Bar Pendulum 



This example is analysed as a benchmark problem in several references [7, 14]. The 
system is composed by five prismatic bars with unitary length, section 0.1 x 0.1 m'^ 
and unitary masses, linked by revolute joints, being one of them fixed. The system is 
released from rest under the action of gravity g = 9.81 m/s^ from the configuration 
shown in Fig. 5. 
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The configuration is defined by a set of dependent coordinates collected in vec- 
tor q e R*°, which are inertial cartesian coordinates of four non-coplanary points at 
each bar. These coordinates are related by a set of 55 constraint equations collected 
in vector $, where 6x5 = 30 equations correspond to the constant-distance con- 
straints among the four points of each bar. There are 25 more equations related to 
the five revolute joints, with five constraints for each joint. Based on these consider- 
ations, the system has a total number of five degrees of freedom. 

The motion is integrated during 10 s with the trapezoidal rule, employing an 
augmented Lagrangian method for the enforcement of the position constraint and 
using different time steps and velocity projections. The matrices employed for the 
projections are the mass matrix M and the identity matrix 1. As pointed out in 
Sect. 3, from a purely mathematical point of view both matrices are suitable for 
projection, as long as both are symmetric and positive definite. However, it will be 
shown next that they perform very differently in terms of stability, justified by the 
behaviour of the artificial energy that they introduce. 

A first numerical experiment is performed with a time step At = 0.005 s. 
Figure 6 shows that the integration fails at t ~ 2 s if no projections are performed, 
showing a dramatical increase of energy which is clearly related to an abrupt in- 
crease of the velocity constraint. If a projection is performed with the mass matrix 
M and a projection penalty parameter a = 10^, the velocity constraint remains 
small during all the computation and the energy remains very close to the theoreti- 
cal constant value. 

Note that the projection with M introduces artificial dissipation (which is always 
positive, as justified in Proposition 4), but the trapezoidal rule itself introduces some 
artificial energy too, which may be positive or negative, as remarked in Sect. 4. This 
explains the behavior of the total energy observed in Fig. 6a, which is not mono- 
tonically decreasing as in the case of the conserving integrator employed in the 
two-particle example presented in Sect. 6.1. 

On the other hand. Fig. 6b shows that the projection performed with the identity 
matrix succeeds in enforcing the velocity constraint. However, Fig. 6a reveals that it 
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introduces a significant amount of artificial positive energy, which is unrealistic and 
may eventually produce a failure of the computation. This behaviour is fully justified 
by Proposition 3, since B'' = (M^^^q)'' is not, in general, a positive semidefinite 
matrix. 

The results of a second experiment performed with a larger time step. At = 
0.01 s, are shown in Fig. 7. These results are very similar to those presented in Fig. 6 
for a smaller time step. Figure 7b shows that both projections succeed in enforcing 
the velocity constraint, but there is a significant difference in the behaviour of the 
energy, as shown in Fig. 7a. It is apparent that the increment of the time step ampli- 
fies the growth of the artificial positive energy introduced by the projection with the 
identity matrix, deteriorating the overall performance of the computation. 

A third set of experiments is performed in order to explore the effect of the pro- 
jection penalty for a fixed time step. Figure 8 shows that the overall performance 
of the mass matrix projection is very good. Figure 8a reveals that the energy is not 
very affected by the projection penalty value, while Fig. 8b shows that the larger 
projection penalty improves the satisfaction of the velocity constraint, as expected. 
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Fig. 9 Five-bar pendulum, trapezoidal rule with At = 0.01 s, projection with identity matrix 



Figure 9 shows the results obtained with the identity projection matrix. Figure 9b 
shows that the larger projection penalty improves the satisfaction of the velocity con- 
straint, as expected. But note in Fig. 9a how this effect does not help to significantly 
reduce the growth of the artificial positive energy introduced by the projection. 



7 Conclusions 



The main conclusions that may be drawn from the developments presented in this 
work are: 

• A velocity projection, solving a minimization problem based on a positive def- 
inite matrix and using a penalty method, succeeds in maintaining the numerical 
solution of the index- 3 DAE system close to the velocity constraint manifold 
^ = 0. This projection has a stabilization effect that has been reported in the 
literature and tested with a simple example in this paper. 

• The velocity projection may introduce some artificial energy into the system. If 
this energy is negative (dissipation) the stabilization effect of the projections is 
enhanced, making it possible to adopt larger integration time steps or allowing 
longer term computations. 

On the other hand, a positive energy spoils the stabilization effect introduced 
by the projections, resulting in an unrealistic motion and eventually a failure of 
the computations. 

• The consequence of the previous statement is that not all positive definite matri- 
ces are suited to performing a useful projection. Some positive artificial energy 
may be introduced into the system, compromising the stability of the numerical 
scheme. The numerical experiment presented in this paper, despite its simplicity, 
shows this effect very clearly. 
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• For a system with a mass matrix M, subject to a holonomic constraint function 
vector $(q), a velocity projection does not increase the energy of the system if 
the symmetric part of the matrix MA~^$q$q is positive semidefinite, A being 
the projection matrix. 

This property provides a practical criterion for the selection of a projection 
matrix, which is an important issue that is not explicitly discussed in the literature 
in this field. 

• Additionally, if the symmetric part of the matrix MA~^ is definite, the projection 
of incompatible velocities always introduces energy dissipation. 

• Finally, the energy balance of the velocity projection provides an alternative jus- 
tification for the positive performance of the mass-orthogonal projection reported 
in the literature. 
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A General Purpose Algorithm for Optimal 
Trajectory Planning of Closed Loop Multibody 
Systems 

Makoto Iwamura, Peter Eberhard, Werner Schiehlen, and Robert Seifried 



Abstract This paper discusses the optimal trajectory planning problem of multi- 
body systems. The aim of this study is to develop a general purpose optimal 
trajectory planning algorithm to be applied to arbitrary multibody systems. Multi- 
body systems may be divided into two groups, i.e. open loop systems and closed 
loop systems [8]. In [1 1] an optimal trajectory planning algorithm for open loop sys- 
tems was presented. In this paper, optimal trajectory planning algorithms for closed 
loop systems are proposed by extending the algorithm for open loop systems. Two 
types of methods are presented based on the dynamic analysis by computational 
algorithms for closed loop systems. The first method uses generalized coordinate 
partitioning and embedding techniques. The second method is based on an aug- 
mented formulation with Lagrange multipliers. The first method is easily applicable 
to non-redundant actuation systems, while the second method considers redundant 
actuation. The validity of these methods for optimal trajectory planning is confirmed 
by computational results and their features are compared. 



1 Introduction 

Optimal trajectory planning of multibody systems deals with the problem to find 
a trajectory between a specified initial and final state that minimizes a given 
cost function. This topic has been studied by many researchers as an impor- 
tant problem especially in the fields of robotics and space engineering, see e.g. 
[1,3,4,7,9, 10, 13,20]. However, most of the previous papers discuss the problem of 
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Fig. 1 Topology of 
multibody systems, 

(a) open chain system, 

(b) tree-structured system, 

(c) closed loop system 





chain 



a specific multibody system, such as a manipulator, space robot, biped robot, etc., 
while a general purpose optimal trajectory planning algorithm which can be applied 
to arbitrary multibody systems is missing. Hence, this study considers the devel- 
opment of a general purpose optimal trajectory planning algorithm for multibody 
systems by extending the dynamics formalisms for multibody systems. The goal of 
this study is to develop a software that can conduct optimal trajectory calculation 
easily by inputting only the kinematical and dynamical parameters, and the cost 
function. 

Multibody systems may be divided into two groups, i.e. open loop systems and 
closed loop systems, see Fig. 1 and [8]. In [11], an optimal trajectory planning al- 
gorithm for open loop systems, as shown in Fig. la,b, was presented. In this study, 
a general purpose optimal trajectory planning algorithm is established by extending 
the algorithm to closed loop systems, as show in Fig. Ic. 

For solving the optimal trajectory planning problem, two different approaches 
are commonly used, i.e. the exact method based on the minimum principle, see e.g. 
[1,3,10,13] and the approximate method using a Ritz method, see e.g. [4,7,9,20]. In 
this study, the former approach is considered. The solution of the optimal trajectory 
planning problems based on the minimum principle requires sensitivity analysis of 
the dynamic equations, i.e. the partial derivatives of the state equation with respect 
to the states and the control variables have to be calculated. 

In the modeling of closed loop systems, a method that transforms the closed 
loop systems to open loop tree- structured systems by cutting the loop virtually and 
adding algebraic loop closing conditions [14, 15,21] is well known. In this paper, two 
types of optimal trajectory planning algorithm that use this concept of virtual open 
loop tree-structured systems are proposed. Firstly, an optimal trajectory planning 
algorithm is formulated based on generalized coordinate partitioning and embed- 
ding techniques [8] that derives the minimum number of differential equations of 
motion by eliminating the dependent coordinates. This method is exact in the sense 
that it satisfies the loop closing conditions completely, but it is difficult to apply to 
redundant actuation systems, characterized by more actuators than degrees of 
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freedom. Let us denote the number of actuators included in the closed loop system 
by Ha- Generally, closed loop systems may be designed as «a > n. If «a > n 
the system is called a redundant actuation system [15]. Hence, secondary, another 
optimal trajectory planning algorithm based on an augmented formulation [8] is 
proposed. This method can also be applied to redundant actuation systems but it 
might yield small violations of the loop closing conditions. 



2 Optimal Trajectory Planning Problem of Multibody Systems 

In this section the basics of the two proposed methods are presented. 



2.1 Problem Formulation 1: Minimal Form of Equation 
of Motion 

The equations of motion of multibody systems can be expressed by using the gen- 
eralized coordinates q € R" and the generalized control forces t e i?" as 

M(q)q + c(q,q) = T, (1) 

where M(q) e j^"^" [^ the inertia matrix, c(q, q) e R" is the vector of centrifugal, 
and Coriolis forces, gravity, and any applied forces other than r. Equation (1) can 
be rewritten 

dq 

at 

dq 

-f=h(q,q,T), (3) 

at 

where h : q, q, t i-> q is the function defining the acceleration 

q = h(q, q, T) = M-i (q){T - c(q, q)}. (4) 

We define the state vector as x = [q'^ q ]^ € R^" and the input vector as u = t. 
Then, the state equations can be expressed in the form 

x = f(x,u). (5) 

We assume that the initial and final states are given as x(0) = xq for ? = and 
x(fy) = xy for t = tf. The cost function is chosen as 



f ' F(x,u) 
Jo 



J = / F(x,u)dr. (6) 

/o 
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The problem considered in this paper can be formulated as an optimal control 
problem as follows: Find u(f) by minimizing the cost function (6) subject to the 
initial and final conditions for the system (5). 



2.2 Problem Formulation 2: Augmented Equations 



For closed loop systems, (1) and (5) may not be available directly. Therefore, we 
transform a closed loop system into an open loop tree- structured system by cutting 
the loop at one of its joints, see Fig. 2. The equations of motion of virtual open loop 
tree-structured systems read as 



M{e)0 +c{e,0) = To, 



(7) 



where e R"° is the vector of joint variables, tq e R"° is the vector of driving 
torques/forces, M(^) e ^"oxnojg tjjg inertia matrix, c(^, 6) e R"" is the vector 
of centrifugal, Coriolis and applied forces of the virtual open loop tree- structured 
system. The equations of motion for the original closed loop system can be written 
by adding the loop closing conditions ^ (0) = € R'" between the cut branches to 
(7) as 



M(0)0 +c(0,0) + ^lX. =To, 
<if{0) = 0, 



(8) 
(9) 



where $# = 8^/80 e j^mxno ^^^ A e i?™ is the Lagrange multiphers vector 
representing the constraint reaction forces originating from the cut joints. The closed 
loop system has « = no — m degrees of freedom. Equations (8) and (9) can be 
rewritten as 



d0 
d7 
d^ 
d7 



= 0. 



= h(»,».To), 



(10) 
(11) 




• actuated joint o unactuated joint x cut joint 
Fig. 2 (a) A closed loop system, and (b) its transfonnation to a virtual tree-structured system 
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where h:^,^.Toi-^^isa function which is derived in Sect. 4. We define the state 
vector as X = [^ ^ ]^ and 
can be expressed in the form 



vector as X = [^ ^ ]^ and the input vector as u = Tq. Then, the state equations 



x = f(x,u). (12) 

We assume that the initial and final states are given as x(0) = Xq for t = and 
x(fy) = Xf for t = tf. The cost function is 

J = f ' T(x.u)dt. (13) 



Then, the problem considered in this paper can also be stated as follows: Find u(?) 
by minimizing the cost function (13) subject to the initial and final conditions for 

the system (12). 



2.3 Solution Procedure for Optimal Trajectory 
Planning Problems 

The minimum principle, see [18], is well-known as a mathematical tool for solving 
optimal control problems. In the following, we consider the case of formulation 1, 
but the solution procedure for the case of formulation 2 is the same. Let us introduce 
a vector ^ e R^" called adjoint vector. Then, the Hamiltonian is defined by using 
the adjoint vector as 

H(f,x.u) = F(x,u) + f^{{x,u). (14) 

The behavior of x(f ) and ^ (f ) is determined by the canonical equations of Hamilton 

dH 
x= — =l (15) 

dyr 

dH dF rdt 

f = -^ = -^-fW^ (16) 

dx ax ax 

where (15) is equivalent to the state equation (5), and (16) is called the adjoint 
equation. The minimum principle [18] states that the necessary condition for u(f) 
to be optimal is that there exists a nonzero vector ^ that satisfies (15), (16), and the 
Hamiltonian (14) is minimized for all times t. 

The solution procedure based on the minimum principle is as follows. At first, 
the form of optimal control u"^' that minimizes the Hamiltonian (14) is derived 
as a function of x and ^ . Then, u°'"(x, ^) is substituted into (15) and (16). After 
that the differential equations (15) and (16) are solved for x, ^ under the two-point 
boundary conditions. Finally, the optimal control can be obtained by substituting 
X, ^ into u'''"(x, ■f). Since the two-point boundary value problem is difficult to solve 
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analytically, a numerical algorithm is usually used. Moreover, in the case that the 
analytical form of the optimal control u"'" can not be obtained, one should solve the 
two-point boundary value problem iteratively by improving u as u^'"*"^-* = u*^'-* 
8u^'' so that H decreases, based on the information from the gradient 



+ 



= — + f^ — . 

9u 9u 9u 



(17) 



The simplest method is to choose Su = —dH/du but there are many other more 
sophisticated algorithms to compute Su, see e.g. [5, 12, 19]. 

A typical computational procedure for optimal trajectory planning problems is 
shown in Fig. 3. The solution of the optimal trajectory planning problem requires a 
forward dynamics computation including the computation of the state equations f 
and a sensitivity analysis of dynamic equations, i.e. computing the partial derivatives 
of the state equations with respect to the state and control variables 9f/3x. 9f/9u. 
One can compute f over time by using existing forward dynamics algorithms. On the 
other hand, a general method for computing these derivatives is not yet available. 
For the case of systems with only few degrees of freedom, it is possible to derive 
the closed-form equations of 9f/9x, 9f/9u. However, it becomes difficult to obtain 



model description and data input 




optimal trajectory 



Fig. 3 Computational procedure for optimal trajectory planning problems 
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the closed-form equations analytically if the degrees of freedom or its complexity 
increases. Although finite difference methods can be used to obtain approximate 
values of the derivatives, its inaccuracy can affect the convergence of the optimiza- 
tion process. In [11], a general and exact method to compute 3f/9x, 3f/9u for open 
loop systems was presented. In this paper, methods to compute 9f/3x, 9f/3u (or 
9f/3x, 9f/9u) for closed loop systems are developed by extending the algorithm for 
open loop systems. The procedure for computing the optimal control u (or u) after 
obtaining f (or f) and the derivatives 9f/9x, df/du (or 9f/9x, ()f/9u) is similar to the 
procedure presented in [11]. 



3 Optimal Trajectory Planning Algorithm Using Coordinate 
Partitioning and Embedding Techniques 

In this section only non-redundant actuation systems, i.e. ita = n, are considered 
and an optimal control algorithm based on problem formulation 1 is developed. 
Since for closed loop systems it is difficult to compute f, 9f/3x, 9f/9u directly, 
we compute these quantities by using the concept of the virtual open loop tree- 
structured system, i.e. (8), (9) and use a coordinate partitioning technique to derive 
the equation of motion with minimal order of form (1). 
Differentiating (9) with respect to time leads to 



4>{0)^'i>e0 = [«&q4)p] 



= 0, (18) 



where the vector of joint variables of the virtual open loop tree- structured system 
G R"o is partitioned by the independent coordinates q & R" and the depen- 
dent coordinates p € R'", and $# e Rf^^no [^ also divided into $q e ji>"^n and 
$p e R'"^'" corresponding to the coordinate partitioning. If there are m indepen- 
dent constraints, we can assume that $p is nonsingular without loss of generality 
because ^g has m independent columns. Thus from (18), one obtains an expression 
for the velocity of the dependent coordinates as 

p = -«&-i4.qq. (19) 

If the independent coordinates q are chosen as a subset of the coordinates of the 
virtual tree-structured system, then can be written in terms of q as 

= Wq, (20) 

where 



W 



E 



(21) 
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is a Jacobian matrix and E is an identity matrix. We can compute $p and $q by 
using the Jacobian computational method for serial link manipulators, see e.g. Orin's 
method [17]. 

Differentiating (20) with respect to time yields 



6i = Wq + (Wq)qq = Wq + Gq^ 



where 



W„ 



O 

^p ^pq^p ^q ^p ^qq 



(22) 



(23) 



is a Hessian matrix and O is a zero matrix. We can compute $pq and $qq by using 
the Hessian computational method for serial link manipulators, see e.g. Nakamura's 
method [16]. 

Substituting (22) into (8) leads to 



M(Wq + Gq2) + c + ^jX = tq. 
Premultiplying this equation by W^ becomes 

W^MWq + W^(MGq2 + c) = W^tq, 
where the principle of virtual work 

W^$[ = [E-(4.-l$q)^] 



q 



= 



(24) 



(25) 



(26) 



is used. By comparing (25) with (1), we can obtain the following relationships 

M(q) = W^MW, 

q) = W^(M( 



c(q,q) = W^(MGq2+c), 



(27) 
(28) 
(29) 



These equations allow to compute the quantities of the original closed loop system 
M, c, T from the quantities of the virtual open loop tree-structured system M, c, tq 
by using W, G. 

Next, we consider the partial derivative of f with respect to x. From (2) and (3), 
9f/3x can be expressed as 



df 



9q 9q 






()q 3q 
()q 3q 


= 


" O E " 

9q 9q 


_ 9q 3q _ 




_ 9q 9q - 



(30) 
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In order to compute df/dx the quantities 9q/3q and 3q/9q must be computed. 
In [11], the following equations are derived for open loop systems 



(31) 
(32) 



This relationship does hold even in the case of closed loop systems since the equa- 
tions of motion (1) have the same form as the one of open loop systems. Hence, the 
problem reduces to the computation of 9T/9q and 9T/9q. By differentiating (29) 
with respect to q and q, it follows 



9q 


-M" 


,dr 


9q 




dq 


9q 


-M~ 


,dr 


9q 




dq 



9t 

9q 
9t 



r 9ro ^ 9W^ 

W — 1 To 



9q 9q 

^,9ro^9W^ 



9q 9q 



W^^W + G^ro, 



To = W^^W. 



(33) 
(34) 



Thereby the definition of the Jacobian matrix 9^/9q = 96 /dq = W is used which 
can be obtained by differentiating (20) with respect to q. These equations allow to 
compute the quantities of the original closed loop system 9T/9q, 9T/9q from the 
quantities of the virtual open loop tree-structured system To, dxo/dO, dto/dO by 
using W, G. 

From (2)-(4), the partial derivative of the state equations f with respect to u can 
be expressed as 





-9q- 


dt 


9t 


d^^ 


9q 




L9t J 


dq 9h 


97 


9t 



o 

9q 
9t 



= — =M- 



(35) 



(36) 



In the following, an algorithm to compute f, 9f/9x, 9f/9u for given x, u, i.e. 
q, q, T is summarized. It should be noticed that the methods to compute the quan- 
tities To, dto/dO, dxo/dO for open loop tree-structured systems are already es- 
tablished [11]. We define ri{0 ,0 .0) as the function that calculates To for given 
0.6,0 by using the inverse dynamics algorithm for open loop tree-structured sys- 
tems. We also define tji^ • 6 ,0) as the function that calculates 9to/9^, dxo/dO for 
given 0,0,0 by using the algorithm proposed in [11] for open loop tree-structured 
systems. 

Algorithm 

1. For a given q, find p that satisfies the loop closing conditions (9) by using the 
Newton-Raphson algorithm, then get = [q^ P^]^ ■ 
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2. Compute $q, $p by using Orin's method [17], and compute $pq, $qq by using 
Nakamura's method [16]. Calculate W, G by (21) and (23), respectively. 

3. Compute 6 corresponding to q by (20). 

4. Computec(e,^) = Ti(e,^,0). 

5. Let e, e R"'-' be the / th unit vector Compute m, = t i (^ , ^ , e,) — c for / = 1 
to no, then calculate the inertia matrix as M(^) = [mi mj ■ ■ -^no]- 

6. Compute M(q), c(q, q) by (27) and (28). 

7. Compute q = M~^ (r - c), then get f by (2) and (3). 

8. Compute 6 by (22). 

9. Compute To = Ti(^,^,^). 

10. Compute (aTo/96>. 3to/9^) = r 2(0, 0,0). 

11. Compute ar/aq, aT/9q by (33) and (34). 

12. Compute 9q/9q, 9q/9q by (31), (32) and compute 9q/9T by (36). 

13. Compute 9f/9x by (30) and compute 9f/9u by (35). 

The proposed method satisfies the loop closing conditions (9) at all times. How- 
ever, this method is difficult to apply to redundant actuation systems, i.e. «« > «, 
since in that case such a simple relationship as (29) does not hold anymore. 



4 Optimal Trajectory Planning Algorithm Based 
on an Augmented Formulation 

In this section, an optimal trajectory planning algorithm that can be applied to the 
redundant actuation systems is formulated based on the problem formulation 2. The 
values of f, 9f/9x, 9f/9u required for optimal control calculation are computed di- 
rectly from the differential-algebraic equations (8) and (9). 
Differentiating (9) twice with respect to time yields 



i^(e) = <^e0 +i'teO)eO ='te0 +'^eeO = 0. 
Equations (8) and (37) can be combined in matrix form as 



where 



<i>e O _ 


"0' 


+ 


'c{0,0)- 

.y(o,0)_ 


= 


To 





Yi0,0) = 'i>e»0 



(37) 



(38) 



(39) 



The vector of accelerations and Lagrange multipliers can be obtained by solving 

(38) as 

'lil r ivf ihT 1~ Ft n' 

(40) 

From the upper part of (40), the function h ; .0 ,ro i-^ in (11) can be defined as 
^ =h(e,^.To)=Mii(To-c)-Mi2j/, (41) 



"0' 




"M <&r 


-1 


"to-c" 


>. 




.^e O 




. -y . 
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where we defined M,- ,■ as 



M11M12 




M 4.^ 




M21 M22 . 




<^g 






= 


M \e+$[M22^#M ^) -M ^$JM22 

Mf2 -($»M '^ir\ 



(42) 



Next, we consider the partial derivative of f with respect to x. From (10) and (11), 
df/dx can be expressed as 



3f 



'90 dO' 










" E ' 


de 9^ 


— 


do do 


de do 






-do Jo - 




-do gj. 



(43) 



In order to compute 9f/9x, the calculation of dO /dO and dO /dO is necessary. It is 
noticed that the following relationship can be proven 



dO ^^ 9to 

do do 



(44) 
(45) 



where Mil is defined in (42), i.e. Mil =M {E-$[($eM $^)-i$eM }. 

Proof of Equations (44) and (45) 
Equation (38) can be rewritten as 



M{0)0 +c(0,0) = T, 



^ 


"0' 


^ 


"M <&r 




c 




To 


= 




, M = 


. c^ = 




, ^ = 






A. 




'be 




.y . 








where 



Then, (40) can be expressed as 

=M(Or\r-c(0,0)}. 



(46) 



(47) 



(48) 



Since (46) and (48) have the same form as the equations of motion of open loop sys- 
tems, we can derive the following equations by using the same manner as presented 
in [11] 
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do 
do 
do 

9^ 



-M~ 



-M- 



do 

J 9t 
9^' 



From (42) and (47), (49) can be written as 



3^ 

do 

dX. 

Jo 



M <^l 
^e O 



9to 
O 



Mil Mi2 
M21 M22 



9to 
O 



(49) 
(50) 



(51) 



The upper part of (51) is equal to (44). In the same way, (45) can be derived from 
(50). 

In order to solve the optimal trajectory planning problem the partial derivative of 
the state equations f with respect to u is necessary. This can be expressed from (10), 
(11), and (41) as 





- do - 


di 


3to 


3n ~ 


do 




- dxo - 


do 


3h 



o 

do 

- 9to _ 



9to 9to 



= M 



11- 



(52) 



(53) 



In the following, an algorithm to compute f, 9f/9x, 9f/9u for given x, u, i.e. 
,0 .To is summarized. As in the previous section, we use the function ri{0 ,0 .0) 
that calculates tq for given ^ , ^ , ^ by using the inverse dynamics algorithm for open 
loop tree-structured systems, and the function r2{0 , 0,0) that calculates dxo/dO, 
dxo/dO for given 0.0,0 by using the algorithm proposed in [11] for open loop 
tree- structured systems. 



Algorithm 



and compute f^gg by using 



1. Compute ^g by using Orin's method [17], 
Nakamura's method [16]. 

2. Compute y(6l.^) by (39). 

3. Compute c(», ^) = Ti(6l.^,0). 

4. Define e,; e i?"" as the / th unit vector. Compute m, = t 1 (^ , ^ , e,) — c for i = 1 
to no, then calculate the inertia matrix as M(^) = [iiTi iri2 • • -ninQ]. 

5. Compute Mn by (42). 

6. Compute by solving (38), then get f by (10) and (1 1). 

7. Compute (9to/9», 9to/9^) = T2{0,0,0). 

8. Compute 96/ /9e, 9^/9^ by (44), (45) andcompute 96i/9To by (53). 

9. Compute 9f/9x by (43) and compute 9f/9u by (52). 
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The method presented here can also be applied to redundant actuation systems 
without any problems. However, it suffers potentially from so called constraint 
violation since it uses the constraint formulation on acceleration level given by 
(37) instead of the constraint equation on position level given by (9). Though it 
is not proven mathematically, this method may give accurate results in most prac- 
tical applications by combining it with a constraint stabilization technique, e.g. 
Baumgarte's method [2]. 



5 Numerical Examples 

5.1 Non-redundant Actuation System 

In this section, the optimal trajectory planning algorithm using the coordinate par- 
titioning and embedding techniques from Sect. 3 is demonstrated for computing the 
minimum energy trajectory of a closed loop robot manipulator as shown in Fig. 4. 
This system is a non-redundant actuation system, featuring 3 degree of freedom 
and 3 actuators. We transform the closed loop system to a virtual open loop tree- 
structured system by cutting at the joint marked with x in Fig. 4. The kinematical 
parameters (Denavit-Hartenberg parameters [6]) and the dynamical parameters are 
summarized in Tables 1 and 2, respectively. The vector of generalized coordinates q 
and the vector of joint variables of the virtual open loop tree-structured system are 



q=[qiq2q3f = [616293]^ , 
= [9i 02 Ot, 64 65] . 



(54) 
(55) 



(500mm) 



'4 

(800mm) 



(800mm 




Fig. 4 A closed loop robot 
manipulator, see also Fig. 7 
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Table 1 Kinematical 
parameters of the closed loop 
robot manipulator 



Link 


Length 


Twist 


Offset 


Angle 


i 


a, 


ff, 


d, 


0, 


1 





0° 





0, 


2 





90° 





01 


3 





90° 





03 


4 


h 


0° 





04 


5 


h 


0° 





01 



Table 2 Dynamical parameters of the closed loop robot manipulator 



Link 



Mass (kg) Moment of inertia (kg-m^) 
m,- 'i, = diag[/,-, /,,,/-] 



Center of gravity (m) 



1 


10.0 


(1.4100, 1.4100, 1.4100) 


(0.00, 0.00, -0.25) 


2 


5.4 


(0.0023, 0.2891, 0.2891) 


(0.40, 0.00, 0.00) 


3 


3.375 


(0.0014,0.0710,0.0710) 


(0.25, 0.00, 0.00) 


4 


5.4 


(0.0023, 0.2891, 0.2891) 


(0.40, 0.00, 0.00) 


5 


9.0 


(0.0037, 1.2376, 1.2376) 


(0.65, 0.00, 0.00) 



For this problem, we can find the following simple relationships by geometric 
inspection 



Oa = qi -q-i, 

05 = -jT -q2 + q3. 



(56) 
(57) 



Hence, the Jacobian matrix and the Hessian matrix become 



W = 



The cost function is defined here as control energy 



1 











1 











1 





1 


-1 





-1 


1 



o. 



(58) 



/ = 



I u u 

Jo 



dt. 



(59) 



where u = [ti ta r^]'^ . As an example, we consider the problem to find a trajectory 
from the initial configuration q(0) = [0° 30° 180°]^ to the desired configuration 
q{tf) = [60° 90° 240°]^ and vanishing velocity at both ends i.e. q(0) = q(r/) = 0. 
Since the optimal solution depends on the final time tf , we solve the problem for 
different final times tf. Figure 5 shows the minimal value of /, i.e. the energy, for 
different final times tf . 
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Fig. 5 Minimum energy for different final times t f 
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time = 0.00 s time = 0.21s time = 0.43 s time = 0.64 s 
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goal 



time = 0.86 s time = 1.07 s time = 1.29 s time = 1.50 s 

Fig. 6 Optimal trajectory of the closed loop robot manipulator (type 1 , ? /■ = 1.5 s) 



For this problem, two types of trajectories that satisfy the necessary condition for 
optimality are found. The type 1 trajectory is the one that connects the initial and 
final configuration linearly as shown in Fig. 6. The type 2 trajectory is the one in 
which the links fall down in the gravity direction at first while their arms are folding 
up, then fling up to the desired configuration by using the inertial influence as shown 
in Fig. 7. From Fig. 5, it is seen that the minimum energy trajectory switches around 
tf = 2 s, that is, the type 1 trajectory requires least energy if fy < 2 s and the type 
2 trajectory requires least energy if tf > 2 s. In a conventional application, a kind 
of linear motion like the type 1 trajectory is usually used. However, this simulation 
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time = 2.40 s time = 2.60 s time = 2.80 s time = 3.00 s 

Fig. 7 Optimal trajectory of the closed loop robot manipulator (type 2,tf =3.0 s) 



shows the possibihty that we can reduce the energy consumption dramatically by 
selecting a more complex motion like the type 2 trajectory. Figure 8 shows the tra- 
jectory of joint variables for the case of type 2 and f/ = 3 s. 



5.2 Redundant Actuation System 

The optimal trajectory planning algorithm based on the augmented formulation pre- 
sented in Sect. 4 is demonstrated for computing the minimum energy trajectory of a 
planar parallel link manipulator as shown in Fig. 9. This system is a redundant ac- 
tuation system, because it has 3 degrees of freedom and 4 actuators. We transform 
the closed loop system to a virtual open loop tree- structured system by cutting at the 
joint marked with x in Fig. 9. The kinematical and dynamical parameters are given 
in Tables 3 and 4, respectively. 
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Fig. 8 Joint variables versus time (type 2, // =3.0 s) 
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Fig. 9 A planar parallel link manipulator 



In this example the cost function is again defined as control energy 

J = I u^u dt, (60) 



r'f ^ 

= I u udt, 

Jo 



where u = [ti T2 T3 r^]^ . As an example, we consider the problem to find a tra- 
jectory from the initial configuration 6» (0) = [90° 60° 30° - 30° - 60°]^ to the 
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Link 


Length 


Twist 


Offset 


Angle 


/ 


"; 


a, 


di 




0i 


1 


/o/2 


90° 







0i 


2 


h 


0° 







02 


3 


h 


0° 







03 


4 


-lo/2 


90° 







180° + 04 


5 


U 


0° 







05 



Table 4 Dynamical parameters of the planar parallel link manipulator 



Link 



Mass (kg) Moment of inertia (kg-m^) 
m, 'i, ^diag[/,-. /,./,] 



start 



O 



Center of gravity (m) 

's,- = [x.y.zV 



1 


3.0 


(0.05, 0.80, 0.80) 


(0.15,0.00,0.00) 


2 


3.0 


(0.05, 0.80, 0.80) 


(0.15,0.00,0.00) 


3 


3.0 


(0.05, 0.80, 0.80) 


(0.15,0.00,0.00) 


4 


3.0 


(0.05, 0.80, 0.80) 


(0.15,0.00,0.00) 


5 


3.0 


(0.05, 0.80, 0.80) 


(0.15,0.00,0.00) 



o 



c^ 



o 



time = 0.00 s time = 0.21s time = 0.43 s time = 0.64 s 



o 



o 



goal ^^ 



time = 0.86 s time = 1.07 s time = 1.29 s time = 1.50 s 

Fig. 10 Optimal trajectory of the planar parallel link manipulator {tf = 1.5 s) 



final configuration e(f/) = [30° 60° 90° - 90° - 60°]^ and ^(0) = e{tf) = 0. 
The final time is fixed and defined as tf = 1.5 s. Baumgarte's method [2] is used 
to stabilize the loop closing conditions. The obtained optimal trajectory is shown in 
Fig. 10. Figure II shows the trajectory of joint variables. Figure 12 shows the error 
norm of the loop closing condition, i.e. ||^||. As one can see, the error is at most 
10~', therefore, the method is considered here as accurate enough for practical use. 
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Fig. 11 Joint variables versus time (t f = 1.5 s) 
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Fig. 12 Constraint violation versus time (tf = 1.5 s) 
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In this paper, a general purpose optimal trajectory planning algorithm for multi- 
body systems is proposed. Two methods for closed loop systems were developed 
by extending the algorithm for open loop systems. The algorithm using coordinate 
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partitioning and embedding techniques is exact in the sense that it satisfies the loop 
closing conditions at all times but it is difficult to apply to redundant actuation sys- 
tems. On the other hand, the algorithm based on the augmented formulation can be 
applied to the redundant actuation systems quite easily but it can not be guaranteed 
that the loop closing conditions are always satisfied. Hence, one should check the 
results obtained by the second algorithm properly. These algorithms are considered 
to be useful since they allow to conduct optimal trajectory calculation for arbitrary 
closed loop multibody systems from the kinematical and dynamical parameters, and 
the cost function. The efficiency of the methods is demonstrated using two closed 
loop robot manipulators. 

Acknowledgements The authors would like to thank Prof. N. Shimizu (Iwaki Meisei University) 
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Real-Time Simulation of Extended Vehicle 
Drivetrain Dynamics 



Ralf U. Pfau and Thomas Schaden 



Abstract For the virtual engine development, testing and calibration, it is 
advantageous to use the same physical model on different platforms. Due to the 
complexity of the model and its evaluation one has to coop with serve evaluation 
restrictions on the realtime platform. For coupled problems which includes an elec- 
trical system, the equilibrium conditions include an algebraic constraints. Hence 
it is not sufficient to use only an explicit time integration scheme. We extend an 
explicit scheme to a mixed scheme such that the overall performance per time step 
still is below the timing constraint of the realtime platform for reasonable complex 
model with electrical system. 



1 Introduction 

Driven by increasing computing power and by increasing time pressure for shorter 
product cycles more and more of the construction process for new cars is accom- 
panied by and interwoven with computer simulation. There are different tools and 
tool chains for the construction, simulation and evaluation of designs and systems. 
Though the trend is to couple the tools and simulations for a fully virtual engine, 
powertrain and vehicle simulation which allows to take also multi-physical aspects 
and interactions into consideration. Though, increasing computing power and avail- 
able memory supports this, analysis and refinements for the coupled problems are 
needed to take fully advantage of the coupled problem setting and to accurately and 
efficiently solve the coupled problem. 

Additionally, as all modelling and simulation inhabits some compromise with 
respect to details of the physics and accuracy, one also wants to have a smooth 
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Requirements 



Acceptance test 



System Design 



Module Design 




Fiealtzation 



Fig. 1 V development scheme (adapted to our situation) 



transition from pure software models to hardware. This allows to start with simple 
models, e.g. with data from more detailed computer simulations, going to refine- 
ments within the models, updating parameters from prototype measurements, but 
also coupling the software model with testbed system and replacing step by step 
the computer model by hardware, e.g. engine, ECU, transmissions. See Fig. 1 for 
a sketch of the iterative approach used in the well known V development model 
(see [22,28,36] for integrated development processes). 

The first simulation in this style started at the end of the eighties and beginning 
of the nineties ([20] and the references therein). Where one used then the testbed 
and measurement for a verification of the computer simulation before going to the 
assembled system. 

It is easier to accomplish realtime simulation with smaller sized systems where 
a single component or vehicle part is modelled. The tendency is then to be more 
detailed in the description of the model to reduce the modelling error as much as 
possible, but also to detect failures within a component. 

On the other hand, to put a complex model describing a full vehicle behaviour 
on realtime system, one often employed reduced model, reduced order models and 
approximation which are parametrized from more detailed off-line simulations or 
measurements. It is easier to achieve realtime with reduced models, but as the 
parametrization usually results from standard working conditions, the accuracy may 
deteriorate if the calculation runs into critical conditions. To verify, if some failure 
or critical situation results from the model or from the actual configuration or data 
under investigation, a smooth transition and verification with similar calculation re- 
sults in the off-line situation is advantageous. Additionally, in the off-line situation, 
the model may be refined and investigated, if a refinement of the model (the simu- 
lation) or the data (change in the hardware, configuration) is needed. 

To use the model in different situations, one can either use equation driven or 
data driven approaches. In case of equation driven approach, one utilizes a model 
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Fig. 2 Equation driven stages for simulation 
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Fig. 3 Data driven stages 

(fixed equations in ^ Data * BJtBBI9B ► Results 

components) 

Equation; 




based design, automatic code generation, equation reduction with computer algebra 
systems, e.g. based on a Modelica description of the components, see Fig. 2 for the 
stages [8, 10, 24, 33, 38]. Hence the emphasis is on description of the system by 
equations. On the other hand, with a data driven approach, one has a verified and 
tested model with a equation system based on physical modelling. The difference 
between specific configurations is then in the parametrization of the model and com- 
ponents which has to cover sufficiently large domain, see Fig. 3. The challenge then 
is in the development of those data driven simulation environment whereas the us- 
age and utilization in an industrial context is easier compared to an equation driven 
approach. In contrast to the equation driven approach where more knowledge (and 
computer algebra) is used for generating an efficient executable, more has to be done 
for generating an executable for the data driven simulation. On the other hand, one 
can consider specific model characteristics and the equation system for an efficient, 
adapted solution algorithm. As noted in [15] the linear algebra implementation is 
a critical point for the efficiency, even more for realtime simulations. Specialized 
numerical methods have been developed for constrained multibody system and the 
solution of the DAE system with adapted methods utilizing specific approximations 
of the Jacobian, leading to mixed integration methods [1, 6, 7, 1 1, 29, 31, 32]. 

On the testbed system 1 ms simulation time has to be calculated within less then 
1 ms CPU time. Normally one has some soft constraint of 0.5 ms CPU time for the 
simulation of 1 ms simulation time as also the operation system and 10 needs CPU 
time, but a time failure now and then can be coped with as long as some sufficiently 
accurate data is available within 1 ms. On the testbed system either all can be in 
software (SIL = software-in-the-Loop) or some parts in hardware (HIL = hardware 
in the loop), see [8, 24, 28, 34] for experiences and results of other HIL simulations 
and couplings, starting with smaller system and reduced complexity within the vehi- 
cle. But each part on the testbed system communicates with the others via standard 
channels which are similar in a vehicle and each part does not know whether the 
other parts it communicates with is software or hardware (see Fig. 4). With respect 
to system coordination, one has to expect and coop with similar problems and effects 
as for co-simulation coupling of systems. 

In the beginning, it was not possible to run the full multibody system on the 
realtime environment. Therefore model reduction, reduced models and similar 
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Fig. 5 Powertrain example, clip from an AVL CRUISE example model with mechanical 
connections 



approaches have been employed [5, 9, 24, 30]. Also, the complexity of the system 
has been reduced by restriction on relevant subsystems of the vehicle [21, 33, 35] 
and the interaction with the other parts have been approximated. 

Still to have the full vehicle modelled as elastic multibody system with addi- 
tional parts and actors is beyond the capacity of the modem HIL systems. In this 
paper our emphasis is on the powertrain dynamics. That is the interaction and inter- 
play of different components in the car from engine to the tires to yield to overall 
behaviour and different characteristics of the (assembled) car. An accurate and effi- 
cient simulation of the powertrain dynamics is an important tool in the engineering 
application and tuning of control units. To achieve this, the component model has to 
be sufficiently detailed and accurate to include all needed physical effects. 

The program CRUISE is developed by AVL List for the simulation of the 
powertrain dynamics of vehicles with respect to driving performance and fuel 
consumption (Fig. 5). The main array of application has been used the off-line sim- 
ulation of the vehicle to calculate characteristics. It is possible to include hardware 
by measurement and setting the measured curves in the components of the model. 
Some progress has been made recently for the real-time simulation of the mechani- 
cal system [27, 36]. The next step is to include advanced electrical components into 
the driveline. Our focus is on the analyses of and changes for real-time simulation 
of the extended powertrain (or driveline) systems. 

The standard driveline system is a multibody system with some specific 
characteristics as one has mainly rotational DOFs in the system. Besides the 
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mechanical behaviour, the properties of the combined systems is crucially 
influenced by the different controller units which act on the driveline and influ- 
ence the performance, e.g. through gear change, opening/closing of clutches and 
similar. For the development, testing and evaluation of novel powertrain concepts 
additionally electrical components have to be integrated into the simulation to be 
able to simulate and evaluate hybrid vehicle concepts. This is also important from 
the aspect of controller adjustment and parametrization as more and more control 
units are needed for new and hybrid cars whose calibration can efficiently be done 
only in the simulation. 

An example for a successful application of the combined simulation is the AVL 
Turbohybrid [13]. Downsizing and downspeeding is combined with turbo charging, 
direct injection and mild hybridization such that driveability as well as fuel con- 
sumption for given test cycles have been reduced. It has been possible to test and 
adjust the different components to achieve a successful test design which fulfills the 
partly opposite functional requirements [36]. 

In this paper we concentrate on standard office simulation (abbreviated by 
office) which can be utilized for in-depth investigations and on realtime integration 
on testbed systems {realtime). As noted in [15] not only the model, but also actual 
implementation of the numeric and the efficient implementation of the numerical 
linear algebra is critical for the performance. Additionally, for realtime applications 
with an emphasis on larger multibody systems specific time integrators have been 
investigated which reduces to linear implicit problems (see [1, 6]). In our case, the 
interplay between the mechanic and the non-mechanic physics is stronger. 



1.1 Notations 

X, V, a mechanical state with way x, velocity v — x, acceleration a = x 

m moments in the system. 

z (continuous) control variable, e.g. load signal to the engine 

d discrete value in the system, e.g. gear, slip-stick state 

V voltage in the electrical system 

i current in the component 

t time of the system, derivative with respect to time is denoted by ' 

M (diagonal) mass matrix 

F (external, internal) forces, torques in the system 



2 Powertrain System and Vehicle Dynamics 

Although the emphasis is on the powertrain system, components which have an ma- 
jor effect on the driveline and driving performance are modelled internally more in 
details to cover more effects. Other components may include more simplifications. 
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Fig. 6 Component structure. The description data may determine the internal complexity of 
the component and includes the parameterization of the component, e.g. physical constants, 
measurements 




Fig. 7 Model for realtime system with marked domains for mechanic and electrical system 

The actual behaviour of the component is then described by physical constants, 
measurements and/or simulations. 

The vehicle is modelled as blocks, see Fig. 6 which is similar as in Mat- 
lab/Simulink or Modelica based systems. Each component can be described as 
an input-output system with different connections and description data, see Fig. 6 
and [2]. More details can be found in [2, 19] or more generally in [37, 39]. The 
combination of the blocks and the connections yields then the vehicle (see Fig. 7) 
and an equation system for the vehicle which may be an ODE or DAE. 

The internal structure of the components may vary from table interpolation to 
complex physical models or approximation and simplified models. The parametriza- 
tion may come from the model, physical constants, measurements, other calcula- 
tions or parameter identifications. The components may include state events and 
switches, e.g. of the gear, slip-stick phenomena and similar. 



2.1 Multibody System 



The mechanical system is mainly described through rotational degrees of freedom 
and leads to an equation system of the form F = Ma with the diagonal mass 
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matrix M. The left hand side forces F are calculated by the components and may 
depend on all of the state (x, v, controls, voltage, current, . . . )■ Additionally, the mo- 
ments between the components within the system are added are added as unknowns 
to be able to specifying moment which depend on the state [19]. 
This leads to the equation system for the mechanical part 



iciycxi) 



(1) 



where F includes now the moments calculated in the components which acts on 
the mechanic as well as the losses. M is the diagonal mass matrix, B describes the 
acting of the moments between the components on the components, C describes the 
constraints between the mechanical parts and D the relation between the moments. 
The coupling between the components can either by via constraints (in the C part) 
or via moments. Due to the character of the connections, the last rows split into 
Ca = and Dm = F. 

Remark. 

• The equation Ca = is also utilized to reduce the number of degrees of freedom 
in the system from the total number of DOFs to a reduced number of independent 
DOFs for the time integration. 

• Changes in the system, e.g. gear change, opening and closing of connections, 
change the matrix parts C and D , but not M and D . 

For the standard time integration the calculation of the inverse of 

(2) 



[c d) 



has to be calculated (and stored) after each state event/change in the matrices. This 
is not critical for the normal time integration. 

But under realtime conditions, the inversion of S is time critical. Therefore a 
special matrix decomposition based on Schur complement is used. Instead of calcu- 
lation and usage of S~^ the following is used: 

{D-CM-^B)m = F (3) 

a = -M'^Bm. (4) 

As the matrix M is diagonal, the inversion is cheap. The decomposition and 
variable management can be calculated beforehand. Also, a sparsity pattern for B is 
calculated and used for fast matrix-vector multiplication. Changes in the system now 
influence S := (^D — CM~^ B). The inverse of S now has to be calculated/updated 
after each state event. But this results in an order of magnitude smaller CPU time 
compared to the inversion of S. 
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2.2 Control Units 

Besides mechanical properties like inertia, ratios and losses, the overall behaviour of 
the vehicle is determined by the control units acting on the driveline and changing it. 
The tuning and calibration of the controls is one of the major tasks of the simulation. 

There are two kinds of control units in the system: continuous controls and dis- 
crete controls. 

Besides already available controllers in CRUISE, additionally, the develop- 
ment and parametrization of new controllers can be done with Matlab/Simulink. 
Additionally, Matlab/Simulink allows the generation the generation of DLLs for 
coupled simulation and code generation for ECUs (see e.g. the homepage of Mat- 
lab/Simulink and code generation with realtime toolbox.' 

Continuous control units can be either a control which depends directly on input 
values, e.g. z = h{x,v,a) or which can be described by a differential equation 
z' = /(x, V, a, z), e.g. from a PI control. 

Besides continuous states, there are also discrete variables, states and control 
units in the model of the vehicle, e.g. gear or slip-stick condition. Changes in a 
discrete variable causes a state event and a change in the system matrix S . Controls 
which change the discrete variables can be described by inequality constraints of 
the form 

g{d,x,v,a) <0. (5) 

Additionally, other changes in discrete variables, e.g. from slip-stick, can be de- 
scribed by inequality constraints, too. For more on this hybrid systems see [18,23] 
for more general approaches or [25] for specialized approach in vehicle dynamics. 



2.3 Electrical System 

Similar as for the mechanical system, also the electrical system in the vehicle is 
specialized. Components modelled are battery, generator and electrical engine, each 
with a detailed physical model (see [2]). There is no inductivity in the system and 
all hybrid parts are connected in parallel. 

In our situation, the electrical components calculate their current if depending on 
the voltage level u in the subsystem. Taking now the voltage as additional variable, 
this leads to the additional equation for the electrical system per net 



Z!'v(«) 



(6) 

in which the sum goes over all electrical components in the net. 



' MathWorks homepage http://www.mathworks.com. 
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Remark. Due to the specific models, the electrical system consists of resistors and 
capacities. Hence differentiation of (6) gives terms of the form R ■ u and C ■ u with 
resistances R and capacities C which may depend on mechanical state, i , u, and 
additional system states like temperature. The once derived equation can be brought 
into the form ii = right hand side. Such that we have an index of one. 



2.4 Coupled System 

Putting all equations together, we get a coupled system with multibody part (second 
order ODE), control equations (first order part) and constraints (equality and in- 
equality). Additionally, the variables in the system are either continuous or discrete. 
For calculating a stable solution, it is assumed that the discrete variables change 
their values only if an inequality constraint becomes active. Trivial connections can 
be substituted to reduce the number of unknowns, Ca = can be used to reduce 
the integration to the independent DOFs. 



(c^)'C:) 



I (7) 

F(x,x, u,z, d, i) I 



z = f(x,x,u,z,d,i) (8) 

> g(x,x,u,z,d,i) (9) 

J2ijix,x,u) = (10) 



2.5 Analysis Tasks 

The focus is on the simulation and time integration of the system (7)-(10). Addition- 
ally needed boundary conditions, like initial values, environment values are given by 
the task simulation. The same vehicle equation system (7)-(10) is utilized on differ- 
ent platform (office, office RT, realtime/tesbed). This allows on the specific platform 
a detailed analysis, development and validation and verification of the calculation. 
Data can be collected on the realtime platform and used for parametrization on the 
office system. On the office system additional stationary analysis are possible for a 
more in depth analysis of the vehicle with additionally stationary tasks like constant 
driving conditions, inclination abilities. 

Using the same physical model on the different platforms allows a detailed anal- 
ysis of specific effects on the platforms (see Fig. 8). It is possible to test if effects 
occur due to the change from implicit to explicit time integration, replacement of 
software model by hardware (ECU controller), if the parametrization is accurate 
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Fig. 9 Clip from an office driving cycle simulation (ECE). Plotted are desired velocity, actual 
velocity and limits 




Fig. 10 Enlarged section from 9 with marked actual velocity 



enough and similar. By this, a smooth transition on the realtime system is possible 
from software and software-in-the-loop models to hardware-in-the-loop. 

In Figs. 9 and 10 a typical standard driving cycle simulation is depicted with a 
desired velocity and limits on the velocity of the vehicle. The driver tries to follow 
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Fig. 11 Corresponding pedal movements for the ECE clip of Fig. 9 



the desired velocity. The driver is a control unit which compares the desired and 
the actual velocity and calculates from this the pedal movements, see Fig. 11. Due 
to the dynamic of the car and the calculated pedal movements the specific fuel and 
electrical consumptions are calculated. In contrast to the fuel, the electrical energy 
can be reclaimed by the generator (with losses). 



3 Time Integration 

For the simulation the system is integrated with respect to time. In the off-line 
situation the total time interval is normally given beforehand, in the realtime case 
the simulation end time normally is open. In both cases the simulation should be 
accurate and efficient. 



3.1 Office Simulation 

In the office or off-line situation, efficiency is reached by targeting large time 
step and using an implicit or semi-implicit integration step (implicit Euler step or 
Burlisch-Stoer method, see [3, 17]). But still, switches and state events in the system 
should be calculated correctly, otherwise oscillations may be triggered. To achieve 
this, an a priori step size adaptation [26] is used in which the state events are de- 
scribed by (the activation of) inequality constraints. 

For the normal simulation the step size length in the range of 10 ms to 100 ms. 



3.2 Realtime Office 



For development and analysis of problems, e.g. from the realtime simulation it 
is also possible to simulate the situation on the testbed system under realtime 
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conditions on the office computer. By this, one can easier detect problem e.g. due to 
the smaller time steps and parametrization, if an effect/oscillation is mainly related 
to the time step. 

Additionally, in the development step one can easier profile the model and de- 
tect, e.g. evaluation limits, problems, by which parts performance is lost, and for 
example, which model complexity cannot be calculated in realtime. 



3.3 Realtime 

In contrast to the office simulation, the most important condition for realtime is that 
each time step must be finished within the given time frame. The normal time step 
length is 1 ms for resolving effects upto 500 Hz (limit from the NyquistShannon 
sampling theorem, see [14], Chap. 8). As noted (see Sect. 1), the numerical integra- 
tion step should be finished within 0.5 ms CPU time. 

Due to the sampling and communication with external parts, fixed time steps 
have to be used. Inner subdivisions of the interval, as for Runge-Kutta steps are 
possible, but the values from external parts are then extrapolated. 

To achieve an efficient time step, as much as possible is put into the setup and 
initialization. Additionally, profiling of the model evaluation on the realtime office 
system is employed to identify time consuming code parts and finding more efficient 
linear algebra implementations additional assumption that St = I ms. 

As timing experiments reveals, one has about time for 10 to 20 system evaluation 
(see Table 2 in Sect. 4.2). It is known, that even for fast automatic differentiation one 
has the limit of five times the system calculation effort for evaluating the sensitivity 
independent of the number of unknowns [12]. Which would leave to few iterations 
for nonlinear system solver. Hence we are restricted to explicit integrators. 

For the system y' = f(t. y) we used (see [16, 17]) 

• Heun 

ki = f{tn,yn) (11) 

hn 
yn + l = yn + -Y^fi^n.yn) + fitn +h„,y„ + h„ ki)) (12) 

and 

• Bogacki-Shapine 

ki = f(tn,y„) (13) 

k2 = f (t„ + -hn,yn + -h„ki] (14) 

k3 = f it„ + -h„,y„ + -h„k2\ (15) 

2 1 4 

yn+i = yn + -h„ki + -hnkz + -h„k-i (16) 
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kA = f{t„,y„+i) (17) 

7 111 

Zn+l = yn + -^Kky + -hnkj + -h„k-i + -hnk^. (18) 

z4 4 3 o 

The Heun method is of order h^ with two function evaluations where as Bogacki- 
Shapine is of order h^ with four evaluations. The additional information about the 
error estimator from Zn+i in (13)-(18) is only used for additional internal subdivi- 
sion of the interval, if possible. In case of Heun, the value h^ \\ki \\ can be used to 
estimate if subdivisions would be advantageous. 

If the model allows, the usage of Bogacki-Shapine is advantageous due to the bet- 
ter order and additionally, if internal subdivisions are done, then k^ corresponds to 
k\ of the next step, such that only three additional function evaluations are needed. 
Due to possible switches and changes in the system at the beginning of the time 
step, the value of the former step cannot be re-used. 

If a switch occurred at the beginning of the time step, no subdivisions is possible 
due to the additional work for the switch. If no switch occurred, at least one more 
may be possible (doing two | steps internally for the powertrain system). So in the 
internal subdivision control, one has to combine a time estimate with some stability 
or accuracy criteria from the formulas. If it this possible to estimate the time of the 
function evaluation / then additionally this can be used to determine the number of 
possible subdivisions. 

Due to the algebraic constraint from the electrical equilibrium condition in 
(7)-(10) a purely explicit time integration is not sufficient for a stable and accu- 
rate solution [4]. In our case, no inductors are present in the electrical system, but 
only resistance and capacities. Hence the index of the constraint is 1 (with respect 
of having the voltage u as algebraic variable). Additionally, the electrical system 
reacts with a much higher speed than the mechanical system. Hence, in decoupling 
one can first solve for the mechanical system and then for the electrical equilibrium 
^ ij{x, V, m) = with fixed mechanical state x and v. As this reduces to a nonlin- 
ear, one-dimensional equation, the solution can be done efficiently by a few step of 
Regula-falsi. 

Additionally, in evaluating the electrical components, the evaluation is reorder 
such that only the electrical parts, (10), needed for the constraint are evaluated. In 
doing so, the cost of the inversion of (10) for u is further reduced and the explicit 
solver sees only the ODE. 

This time integration step can be seen as a mixed discretization in which the 
differential variables are discretized explicitly and the algebraic variables (in our 
case the voltage u) is discretized implicitly. 



4 Application 

In this section we illustrate our approach with a typical industrial example. 
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Besides the standard components like engine, tyres, brakes in the hybrid car the 
model includes additionally a battery, a generator and an electrical engine. The con- 
trol of the engine and generator is given by a Matlab/Simulink controller which is 
transformed into a DLL for the off-line case and transferred to the hardware for the 
realtime case. The communication from the powertrain model to the controller goes 
via the signal bus similar as in the vehicle. Each system does not know whether 
the other is in software or in hardware. For the realtime case the vehicle, tyres are 
simulated by the testbed system. 

In Fig. 12 the office model is depicted on the left and the realtime model on the 
right. The powertrain system is the same for both. In the realtime case the simula- 
tion of the vehicle, tyres and cockpit is done by other parts of the testbed system. 
Moments and rotational state at these points are transferred between powertrain and 
testbed system. 

In the vehicle, an engine, an electrical motor and a generator are connected via 
a planetary gear to the output driveline. For the overall performance of the vehi- 
cle, the interplay of the mechanical and hybrid components is essential. To achieve 
this, the different controls have to be well adjusted for the performance and fuel 
consumption, e.g. with energy recuperation. 



4.2 Simulation Results 

In this section we compare the realtime office calculation with the office calcula- 
tion with respect to numerical performance. As Windows XP is not a realtime OS, 
there are some measure errors and peaks due to OS events. But for the overall per- 
formance the calculations show the same characteristics and performance as on the 
realtime testbed systems. Additional analysis can be included on the office system 
which are not possible on the testbed. 
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Fig. 12 Model for Office System (left) and Realtime System (right) 
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In the realtime system the powertrain is calculated within our part. The calculation 
of the pedal movements, the load, the vehicle manoeuvre is done in the testbed 
system. 

During the driving of a test case, the control splits the load signal as given by the 
driver (here calculated by the driver component) into load signals for the different 
engines and generator. Calculated curves are plotted in Fig. 13. 

In Fig. 14 a clip of the timing is depicted (with state curves of the system). Critical 
is that it is possible to keep here the total CPU usage below 0.5 most of the time, 
though one can see some increase here due to gear changes and driver actions from 
the low average value of 0.2. 

Calculating offline, it is possible to track the number of iterations needed for the 
embedded solution of the constraint. An increase in the number of steps in Fig. 15 
corresponds to fast changes in the electrical systems and results in higher overall 
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Fig. 13 Load signals from the control Unit to the different engines and generators 




Fig. 14 Timing for RT Office Calculation with additional state curves. Timing curves of the pow- 
ertrain and the full system are plotted in the lower part. The total is above the timing for the 
powertrain. The straight line in the middle corresponds approximately the 0.5 line 
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Fig. 15 Number of nonlinear 
solution steps for the 
constraint over time. Around 
t = 75 and t = 86 s, there are 
fast changes of the electrical 
system 
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CPU times for this time step. Nevertheless, as the iteration is done only for a subset 
of the system, the overall performance can be kept within limits. One can see that a 
higher number of iterations in Fig. 15 corresponds to changes in the system (visible 
in the load signals in Fig. 13) which implies higher changes. But as one can see in 
Fig. 14, the additional CPU demand is small. 



4.2.2 Comparison Implicit and Explicit Solution 

For estimating the accuracy of the testbed calculation, we compared the calcula- 
tion with an implicit solver with 1 ms time step on the realtime office system. For 
the nonlinear system solution we used an Quasi-Newton method in which the Jaco- 
bian is kept constant, but recalculated if accuracy and performance criteria fail. The 
needed CPU time for the nonlinear solution is in average double the time as for the 
explicit case. This would be still sufficient for the constraints for the testbed system, 
giving an overall performance of about 0.4 to 0.6. But due to the nonlinear system 
solution, ever 10th or 20th time step (due to Jacobian calculations) the realtime con- 
straint is violated. Although this does not lead to an extreme increase in the average 
CPU time demand, this makes the implicit method only useful for comparison in 
the realtime office case. 

In Fig. 1 6 we plotted the voltage and current curves for the battery for the implicit 
and explicit solution. In calculating the velocity and the way of the vehicle, we have 
a difference in the way of 0.14% after 100 s simulation due to numerical drift. As 
the controls normally depend on the way (e.g. curve, acceleration possible, ...) we 
have plotted the curves in Fig. 16 over the way. Depending on time, we would have 
a shift of approximately 0.2 s. 



4.2.3 Comparison Realtime and Office 



For comparison, the office driving cycle of the Fig. 9 with 195 s simulated time is 
calculated in 3.23 s total time. The configured step size length is 10 ms. The mini- 
mum step length from the step size control was 2 ms and the average 9.8 ms. Hence 
for most of the time steps the maximum step length was possible. 
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Fig. 16 Battery voltages explicit and implicit over way 
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Table 1 Number of function evaluations per time 
step of either 1ms or 10 ms length. Different real- 
time (rt) integrators compared to office simulation 
(with ECE test cycle). BogSha = Bogachi-Shampine, 
B.S. = Burlisch-Stoer 





Average 


Min 


Max 


rt Heun 1 ms 


9.99 


6 


10 


rt BogSha 1 ms 


8 


8 


8 


rt Impl.Euler 1 ms 


16 


4 


24 


office B.S. 10 ms 


23.6 


16 


1,265 


office Impl. Euler 10 ms 


8.4 


4 


145 


office Impl. Euler 1 ms 


5.7 


4 


142 



Table 2 Estimated time 
per system evaluation 



Integrator 



rt Heun 
rt BogSha 
office Heun 



4.2,4 Comparison with Respect to System Evaluation 



Estimate 



1.70238e-002ms 

1.7542e-002ms 

1.70477e-002ms 



office BogSha 1 .72632e-002 ms 



In the office RT environment it is possible to include some additional timers and 
counters to estimate the time needed per system evaluation and the number of eval- 
uations per time step. The measurements for the example are given in the tables (1) 
and (2). 

Due to the complexity of the system evaluation for counting the number of eval- 
uations in table (1) the number of evaluations of the engine component has been 
monitored. For the timing estimate the time in the integrator routines (with all over- 
head) has been measured. Due to the few more operations for Bogachi-Shampine 
integrator the estimate is slightly higher than for Heun. The time for a refresh is 
included, but is not seen that much in the average. 



212 R.U. Pfau and T. Schaden 

From profiling it is known, that the time needed for the update of system matrices 
after an state event is about 0.1 to 0.2 ms. Additionally, the calculation of the 
other parts of the testbed system need about 0.2 ms. Putting this together with the 
tables (1) and (2) gives a good agreement with Fig. 14. 

Also one can estimate, an iterative solver would fail the realtime time constraint 
(with upto 24 evaluations per time step). 



5 Conclusion 

From the model and the mathematical simulation requirements it is a larger step to 
go from offline simulation to real-time. By analysing the model and the numerical 
algorithm, it is possible to simulate extended drivelines under real-time conditions 
with 1 ms step size as needed for realistic application. This allows a smooth tran- 
sition with respect to platform, but also with respect to software-in-the-loop model 
and hardware-in-the-loop model which is advantageous in the engineering develop- 
ment process allowing a smooth transition from office to testbed calculations and an 
integrated approach in the development for testing and verification. 
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Assessment of Antagonistic Muscle Forces 
During Forearm Flexion/Extension 

Maxime Raison, Christine Detrembleur, Paul Fisette, and Jean-Claude Samin 



Abstract Today, the accurate assessment of muscle forces performed by the human 
body in motion is still expected for many clinical applications and studies. How- 
ever, as most of the joints are overactuated by several muscles, any non-invasive 
muscle force quantification needs to solve a redundancy problem. Consequently, 
the aim of this study is to propose a non-invasive method to assess muscle forces in 
the human body during motion, using a multibody model-based optimization pro- 
cess that attempts to solve the agonistic and antagonistic muscle overactuation. The 
main originality of the proposed method is the cautious using of Electromyographic 
(EMG) data information, known by all to be noisy-corrupted, via a protocol divided 
into two main steps: 

1 . Muscle force static calibration. 

2. Muscle force dynamical quantification. 

In this chapter, the process is applied to a benchmark case: the force quantification 
of the elbow flexor and extensor muscle sets of subjects engaged in weightlifting 
and performing cycles of forearm flexion/extension. A statistical validation of this 
method shows a good inter-test reproducibility and a very good correlation between 
a. the net joint torques resulting from the obtained muscle forces and b. the net joint 
torques given by inverse dynamics. 

Consequently, since the method is able to consider measured information on the 
actual muscle activation, it becomes a promising alternative to methods based on 
preset strategies, usually presented in literature, such as the strategy that maximizes 
endurance defined by Crowninshield et al. 



M. Raison (El), P. Fisette, and J.-C. Samin 

Center for Research in Mechatronics (CEREM), Ecole Polytechnique de Louvain - Universite 

catholique de Louvain, Batiment Stevin, Place du Levant 2, 1348 Louvain-la-Neuve, Belgium 

e-mail: maxime.raison@ucIouvain.be; paul.fisette@ucIouvain.be; jc.samin@uclouvain.be 

http://www.cerem.be 

C. Detrembleur 

Rehabilitation and Physical Medicine Unit (READ), Universite catholique de Louvain, 

Tour Pasteur, Avenue Mounier 53, 1200 Bruxelles, Belgium 

e-mail: christine.detrembleur@uclouvain.be http://www.uclouvain.be/en-read.html 



K. Arczewski et al. (eds.), Multibody Dynamics: Computational Methods 215 

and Applications, Computational Methods in Applied Sciences 23, 

DOI 10.1007/978-90-481-9971-6_ll, © Springer Science-FBusiness MediaB.V. 2011 



216 

1 Introduction 
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Today, the accurate assessment of internal efforts, and particularly the muscle forces 
performed by the human body in motion, is still expected for applications in many 
fields, like: 

• Rehabilitation: for the evaluation and follow-up of patients with musculo-skeletal 
pathologies, e.g. using gait analysis for hemiparetic [16] or scoliotic [30] patients. 

• Ergonomics: for instance, for comfort analysis of vehicle drivers [28, 37] or dur- 
ing vehicle accessibility motion [14]. 

• Prevention: in order to avoid the risks of wounds and the appearance of patholo- 
gies associated with motions, e.g. during maximal pushing efforts [6, 33]. 

• Sports: in order to analyse and improve athletic performances, e.g. during pedal- 
ing [22] or somersault on trampoline [4]. 

However, as most of the joints are overactuated by several muscles (2.6 muscles 
in average per degree offreedom (DOF) in the human body [1]), any non-invasive 
muscle force quantification needs to solve a redundancy problem. 



1.1 Redundancy Problem Formulation 

Let us consider, for instance, the flexion/extension of the human elbow. As rep- 
resented in Fig. 1, this system is composed of three body members, the arm, the 
forearm and the hand, which are articulated around joints, the shoulder, the elbow 
and the wrist, respectively. Particularly, the elbow joint flexion/extension is actuated 
by: 

• The flexor (or agonistic) muscle set, mainly composed of the longus and brevis 
biceps brachii. 



Scapula 



Fig. 1 Illustration of the 
human elbow, including a 
representation of the flexor 
muscle sets (biceps brachii) 
and extensor muscle sets 
(triceps brachii) that 
overactuate the elbow joint 



Humerus 




Forearm 



Hand 



Cubitus 
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• The extensor (or antagonistic) muscle set, mainly composed of the longus and 
lateral triceps brachii. 

Consequently in this example, we can state that on the one hand, pure flex- 
ion/extension corresponds to one rotational DOF of the joint elbow, and that on 
the other hand, there is one flexor muscle set and one extensor muscle set, which 
together overactuate the elbow joint. 

In a dynamical context, the link between the individual muscle forces produced 
about the joint and the resulting moments can be also stated via the Principle of 
Potential Power, as follows: 

N 

MijMwij =^F„,,-/lv,- (1) 

where: 

• M^: is the resultant moment at the joint, i.e. at the elbow, in our example of 
flexion/extension . 

• Ffnj is the force vector performed by the i th muscle set, for i = I, .... N muscle 
sets; here this corresponds to the flexor set (i = 1) and extensor set (i = 2). 

• Acoe is the potential variation of the angular velocity vector at the elbow during 
flexion/extension. 

• A\i is the potential variation of the translation velocity vector of the ;' th muscle 
set insertion points, for / = 1, . . . , A^ muscle sets; here this corresponds to the 
flexor set (;' = 1) and extensor set (;' = 2) ; both Acoe and Z\v, must be chosen 
compatible with the kinematic description of the motion [42] . 

Practically, as vectors Ame and Ay, can be determined using kinematic measure- 
ments and anatomical information, and as the joint resultant moment Te can be 
determined using inverse dynamics, the only unknown variables in (1) are the mus- 
cle forces, Fm,/ . As the motion of forearm pure flexion/extension is here considered 
as having one DOF, equation (1) is a mathematically redundant system that has an 
infinite number of possible solutions for the muscle force repartition between Fm,i 
andFm,2- 



1.2 Classification of Solving Methods 

Among the methods to solve the undetermined problem of the muscle redundancy, 
most frequently used methods have been based on optimization function, which 
attempted to model the muscle co-contraction strategy during motion. Considering 
the cost function, many ones have been proposed in literature, on the basis of criteria 
that minimized the metabolic cost [41], the sum of forces [35], a weighted sum 
of forces [27], or the fatigue [20]. Particularly, a significant advance in the the- 
oretical prediction of individual muscle forces was made by Crowninshield and 
Brand [13], who insisted on the importance of an objective function based on 
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experimental tests (rather than an arbitrary or mathematically convenient one). 
These investigators used a nonlinear optimal design formulation. The objective 
function was based on the idea that during many cyclic, activities with low-level 
control (such as normal walking during steady state), muscles are recruited in such 
a way as to maximize endurance (i.e. maximize the duration a motion can be sus- 
tained). Concretely, they formalized the endurance maximization by minimizing the 
following objective function [13]: 






N 

(2) 






where: 



• Fm,i [N] = force magnitude of the i\h muscle. 

• PCSA, [m^] = physiological cross-sectional area of the i\h muscle. 

• N [-] = total number of muscles considered. 

• n {-] = power, usually set equal to 2 or 3 in order to best fit the experimentally 
measured patterns of forces using electromyographic data [21]. 

The associated inequality constraints are: 

f4 >0, for/ = l,...,iV (3) 

and equality constraints are given by (1). 

Today, most of the methods of muscle force quantification are strictly based on 
Crowninshield cost function, although these authors insisted on the fact that their 
method was simply an approximation. For instance, they showed that, during gait, 
the estimated muscle forces were only close to the EMG patterns during the swing 
phase, and were significantly different during the other phases. As conclusion of 
their discussion, the authors of this method attempt to insist on the importance of 
using muscle activity information in order to improve the quality of results obtained 
by any optimization process. Indeed, this reference points out the fact that the pro- 
posed methods provide more physiologically realistic muscle force results than the 
classical methods, because the results are more consistent with both EMG and phys- 
iological constraints under dynamical conditions. However, it is known that EMG 
is reasonably proportional to the muscle force in static conditions [5], while the 
"EMG - force" relation is not linear in dynamical conditions and has not yet been 
reliably established [21]. 



1.3 Objective of this Study 

In this context, the aim of the present study is the development of a non-invasive 
method to quantify the muscle efforts of the human body in motion, on the basis of 
a model-based optimization that attempts to solve the muscle redundancy problem. 
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including a cautious use of EMG data. The process is applied to a benchmark case: 
the force quantification of the elbow flexor and extensor muscle sets of subjects 
engaged in weightlifting and performing cycles of forearm flexion/extension. 



2 Material and Methods 

This section will describe the principle of our method, the experimental set-up, the 
model and hypotheses, and finally the process of the muscle force quantification. 



2.1 Principle 

The principle of our method, schematically outlined by Fig. 2, is divided into two 
main steps: the muscle force calibration in statics and the muscle force quantification 
in dynamics. 



2.1.1 Protocol Step 1: Force Calibration 

While EMG is reasonably proportional to the muscle force in static conditions [5], 
we must be aware that EMG is not sufficiently reproducible if the data recording 
does not exactly follow the same protocol. Consequently, we decided to calibrate the 
forces of the flexor and extensor muscle sets in isometric conditions, using proper 
EMG filtering and the Hill model [46]: here (Fig. 2, step 1), the elbow angle is 90° 
in the sagittal plane; the subject pulls on a rope fixed to a strain gauge on the floor 
(flexors activated), then the subject pulls on a rope fixed to this strain gauge via 
a pulley on the ceiling (flexors and extensors activated). This calibration gives the 
scale (factor K) and offset (factor A) that will be used to calibrate muscle forces 
during the flexion/extension, which will only be used as "not too bad" initial values 
of forces as input for the muscle overactuation solving (Fig. 2, step 2). 



2.1.2 Protocol Step 2: Force Quantification 

As soon as the calibration is completed, the subjects, still equipped with exactly 
the same optokinetic and EMG sensors, are engaged in weightlifting and perform 
several cycles of forearm flexion/extension (Fig. 2, step 2). For each of these trials, 
the model-based process comprises three consecutive steps, detailed in Sect. 2.4: 

1 . Kinematics identification: an optimization process that estimates the joint con- 
figurations, qmod, of the MultiBody System (MBS) that best fits the experimental 
joint configurations, qexp, measured by optokinetic sensors. The corresponding 
velocities, q, and accelerations, q, are determined using numerical derivatives. 
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Protocol step 1: Force pre-calibratiou in static conditions 
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Fig. 2 General principle of the muscle force quantification, featuring both protocol steps 



2. Inverse dynamics: an inverse dynamical model that provides the elbow con- 
tributive net torque Qtnv via recursive Newton-Euler equations of motion of the 
MBS, available in symbolic form [34]. 

3. Muscle overactuation solving: an optimization process that computes the forces 
of the flexor and extensor muscle sets during forearm flexion/extension. Start- 
ing from the K and A factors given by step 1 of the protocol and the filtered 
flexor/extensor forces during flexion/extension, this optimization adapts K and 
A so that the elbow net torque Qemg, computed from the muscle forces, given 
their insertion points, best fits the elbow net torque Qi^, given by the inverse 
dynamics. 



2.2 Experimental Set-up 



The measurement set-up (Fig. 3a) is composed of seven optokinetic sensors (Elite- 
BTS acquisition system [7]) fixed at the subject's joint landmarks (in Fig. 3b: points 
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Fig. 3 (a) Illustration of the subject carrying an 8 kg weight and performing several cycles of 
forearm flexion and extension, featuring the optokinetic sensors (reflecting spheres) and the EMG 
electrodes (double grips). Superimposition of the MBS (segments) on the basis of the weight con- 
figuration and the optokinetic sensors, (b) Description of the model, featuring the seven optokinetic 
sensors, based on anatomical landmark tables [18], that define the two articulated rigid bodies 
(the arm and the foreami), each defined by three points, articulated via two spherical joints (the 
acromion, representing the shoulder, and the elbow joint center) 



from 1 to 5, plus both extremities of the weight axis), and also ElectroMyoGraphic 
(or EMG) data (Elite-BTS), which allow us to calculate the activation of the consid- 
ered muscles during the motion. 

The optokinetic data are sampled at 100 Hz and filtered by a 15 Hz adaptive low- 
pass numerical filter. The EMG data are first sampled at 1 ,000 Hz, then synchronized 
with the data at 100 Hz, after being rectified and filtered by a 5th order Butterworth 
low-pass filter [46]. 

The experiments were performed by healthy subjects related to our laboratory, 
who gave their informed consent to perform the experiments. 



2.3 Model and Hypotheses 



First of all, the system is modeled as a constrained MBS, using kinematic loops and 
cutting procedure (ball cuts), in order to take the muscle connection into account. 
As we are using the coordinate partitioning method, independent and dependent 
variables are systematically used to model this closed-loop system. In detail, the 
MBS (the segments in Fig. 3b) is composed of two rigid bodies (the arm and the 
forearm) articulated around spherical joints (the shoulder and the elbow), giving a 
total of six independent variables. A third body, the hand, is attached to the forearm 
via a locked revolute joint. Finally, both flexor (biceps brachii) and extensor (triceps 
brachii) muscle sets are also considered as bodies with mass and inertia parameters. 
The shoulder joint is considered as the reference point of the model, given that 
the subject is supposed to keep the shoulder fixed during the tests: practically, we 
checked that the subject did not move the shoulder very much during each trial and 
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we systematically subtracted the values of the shoulder kinematic components from 
all kinematic data in order to consider the shoulder as fixed at the inertial reference 
point (0, 0, 0) in our model. 

A few characteristics and assumptions must be formulated about the different 
sets of inputs: 

• The shoulder joint being considered as the reference point of the model, the 
forces Fext and torques M„/ between the body and its environment, i.e. the 
"upper- shoulder", are constraint forces that will not contribute to the equations 
of motion, and thus do not need to be measured. They can possibly be computed 
using the action-reaction principle. 

• The body inertia parameters, i.e. the masses m j , inertia moments // and centre of 

mass positions OM j of the j ''' body member are taken from the tables of inertia 
from de Leva [18]. The inertia parameter identification is not part of this research: 
indeed, previous investigations [11] showed that non-invasive in-vivo dynamical 
identifications of body parameters are presently inappropriate for human body 
dynamics, because the resulting body parameters have significant inaccuracies 
due to experimental errors in the input data, related to the configuration of the 
bodies, and the external force and torque measurements. 

• The system configuration, i.e. the experimental absolute coordinates x„p(f ) of the 
reference points, are measured by the six optokinetic sensors. The corresponding 
joint coordinates q of the MBS are numerically determined by a kinematic iden- 
tification process and, after a low-pass filter, the corresponding velocities q{t) 
and accelerations q are presently estimated from q by numerical differentiation 
techniques. 

Considering the joint kinematics, we are aware that more adequate elbow and 
shoulder joint models could be used: in particular, previous studies [3] have de- 
veloped more complex three-dimensional joints for the shoulder. The present 
model has been implemented with spherical joints, which is sufficiently accu- 
rate [3 1] at present with relation to our muscle overactuation solving method, but 
the model should be extended in the future in order to include more sophisticated 
joints. Finally, let us note that the elbow flexion/extension is physiologically 
defined using only one DOR However, in practice, the axis orientation of the 
instantaneous relative velocity vector i?^ at the elbow continuously changes dur- 
ing flexion/extension. Consequently, it is very important to state that, instant of 
considering only one DOF for the flexion/extension, it is more accurate to con- 
sider three DOFs (i.e. a spherical joint) at the elbow in order to take into account 
the variation of the Qe axis, and then to further project the elbow vector torque 
Q onto the Qe axis, in order to obtain the elbow joint net torque g„„, along the 
instantaneous rotation axis that contributes to the motion of flexion/extension, as 
follows: 
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As introduced in Sect. 2.1, the protocol and the underlying calculation process are 
divided into two main steps: the muscle force calibration in statics (detailed in 
Sect. 2.4. 1) and the muscle force quantification (Sect. 2.4.2). 



2.4.1 Muscle Force Calibration 

While the EMG is reasonably proportional to the muscle force in isometric condi- 
tions [5], we must be aware that the EMG is not sufficiently reproducible if the data 
recording does not exactly follow the same protocol. Consequently, we decided to 
calibrate the forces of the flexor and extensor muscle sets in isometric conditions, 
using proper EMG filtering and the Hill model [46]: here (Fig. 2, step 1), the elbow 
angle is 90° in the sagittal plane; the subject pulls on a rope fixed to a strain gauge 
on the floor (flexors activated), then the subject pulls on a rope fixed to this strain 
gauge via a pulley on the ceiling (flexors and extensors activated). 

EMG processing, in static or dynamical conditions, can be divided into five se- 
quential steps to transform the raw EMG data of a muscle to its calibrated force, as 
illustrated in Fig. 4 and described in the following sections. 



EMG Rectification 

First of all, the raw EMG signal must be rectified, i.e. the absolute values of each sig- 
nal sample are taken, in order to take into account the fact that the EMG electrodes 
record a signal of muscle fiber polarization-depolarization [46]. 
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Fig. 4 Illustration of the EMG Processing, representing the sequential transformations from the 
raw EMG data to the normalized muscle force F{t) 
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EMG Filtering 

High-frequency noise, due to electrical interference such as wiring motion, must 
be removed from the signal by applying a low-pass filter that has zero-phase delay 
properties, so filtering does not shift the signal in time; in practice, we have chosen 
the commonly recommended forward and reverse low-pass 5th order Butterworth 
filter, with a cut-off frequency of 15 Hz in order to be sure to take the muscular 
contraction frequencies into account (knowing that the highest known muscular 
contraction frequencies are 12 Hz for a stressed muscle). Finally, this signal must 
be normalized with respect to its maximal value, in order to obtain a number be- 
tween and 1, representing the muscle fiber recruitment rate. Once this process is 
completed, the transformed EMG signal becomes the filtered signal labeled e(t). 

Neural Activation 

The muscle does not contract exactly at the time instant at which the motor unit is 
triggered [46]: indeed, there is a time delay T„e during which the muscle is preparing 
to produce force, and once the muscle begins contracting, the tension ramps up to 
a peak that does not coincide with the EMG peak. One simple way to model this 
delay is using a first order differential equation [46] between the filtered EMG, e(f), 
and the muscle excitation signal usually labeled u(t), as follows: 

e = (m - e)/rne (5) 

where r„e is the excitation time constant. Let us note that e(t) is still a number 
between and 1, also representing the muscle fiber recruitment rate. 



Activation Dynamics 

The muscle excitation u{t) can be related to the corresponding muscle activation 
a(?) by a non-linear first order differential equation [44]: 

a = {u — a)/xa(a, u) (6) 

where Xa {a , m) is a time constant that varies with activation level and whether the 
muscle activation level is increasing or decreasing [38,44]: 

ra{a,u)= rae((0.5 -h 1.5a) if u > a (7) 

T:deact/(0.5 + 1.5a) if u < a (8) 

where tact is the activation time constant and tdeact is the deactivation time constant. 
Finally, the muscle activity is normalized with respect to the maximum voluntary 
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Fig. 5 Representation of tfie 
Hill muscle model, including 
the contractile clement (CE), 
the parallel elastic element 
(PEE) and the series elastic 
element (SEE). /„, denotes the 
muscle length, /, the tendon 
length and /,„ , the global 
musculo-tendon length 




contraction activity [40,44], so that a(t) is a number between and 1, representing 
the muscle fiber recruitment rate. 



Contraction Dynamics 

The transformation from raw EMG data to muscle force is partially based on the 
well-known Hill model [43,44,46] and is widely spread (e.g. [2, 8, 15, 23, 29, 38-40]), 
for which the musculo-tendon complex is composed of the tendon and the muscle 
(Fig. 5): 

• The tendon - the elastic Series Element (SE) - is a "passive" wire that does not 
"generate" motion. 

• The muscle consists in a parallel Passive Element (generally noted PE) with an 
active Contractile Element (generally noted CE) that generates contraction of the 
muscle controlled by neural excitation and then its lengthening or its shortening. 
Contractions of the muscle are assumed to be iso-volume [46], i.e. without vol- 
ume change during contraction. 

Further, the musculo-tendinous force Fmt is computed using the contraction 
dynamics equation; 



active passive damping 



(9) 



where 

• F„a.x is the maximal isometric force that a muscle can perform; F„ax is evaluated 
hereby tables [24]. 

• a{t) is the muscle activation [44] obtained from (6), a number between and 1 
representing the muscle fiber recruitment rate. 

• Ff^(lfn), F^^ (vm) and Ff^{lm) respectively represent the active 
force-length, active force-velocity and passive force-length relations [38,43,46], 
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which are defined by the Hill model; let us note that the ~ sign above the vari- 
ables means that they are normalized, i.e. the force components ^f^, F^^ , F™ 
are normalized with respect to F,„ax, the muscle length /„ is normalized with 
respect to the muscle optimal fiber length Lm' (given by [24]) and the muscle ve- 
locity 'vm is normalized with respect to the muscle maximal contraction velocity 
Vmax (given by [24]). 
• bm is the damping factor, experimentally set to 0.1 by [25]. 

As the last step of the process, Fmt is computed using the contraction dynamics 
given by (9). In this equation, we can observe that a{t) is given by the activation 
dynamics, that /„ andT^ can be computed from the muscle kinematic measure- 
ments, and that bm is set equal to 0.1 by [25]. Consequently, we will essentially 
describe in this section the formulations of the last elements of (9), i.e. the force 
components Ff^, F^^ and Ff^, on the basis of different references, as these 
formulations are continually improved: 

1 . The active force-length relationship Ff^ of the muscle is represented by a Gaus- 
sian function [38,43]: 

pCE ^ ^-(T,n-lf/y (jQ^ 

where /m is the normalized muscle fiber length, and y is a shape factor that 
approximates the force-length relationship of individual sarcomeres, set to 
0.45 [38]. 

2. The active force- velocity relationship F^^ of the muscle is represented by the 
following function [38,43]: 



pCE^ _^(^^(L25±0J5a\ if pCE ^ ^J 
V I'm / 



CE 
I 



(11) 



V'm(2+2/Af)F',', 



+ 1 



(o.25+o.75.)(Fif;--i) ifF^E^aFf^ (12) 

v,„{2+2/Af) ^ 



(0.25+0.75a)(f'< 



where 

• a is the previously computed muscle activation. 

• Tm is the normalized muscle contraction velocity. 

• F'™ is the maximum normalized muscle force achievable when the fiber is 
lengthening, set to 1.4 for young adults [38]. 

• Af is a force-velocity shape factor, which was set to 0.25 [44]. 

Let us note that (11) and (12) must be evaluated in a first time. Then the choice 
between the formulations of (1 1) and (12) can be made a posteriori by evaluating 
the inequalities between F^^ and apf^. 
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3. The passive force-length relationship Ff^ of the muscle is represented by an 
exponential function [38,43]: 



7PE 



e'^ir - l)/e° - 1 



Fr = .... _ 1 "^ (13) 



where 



• k^^ is a shape factor, set to 5 [38]. 

• sjjj is the passive muscle strain due to maximum isometric force, set to 0.6 for 
young adults [38]. 

Despite the fact that this transformation process is widely spread, it is not trivial 
because it is based on a model, on factors given by statistical tables, and on raw 
EMG data that do not directly measure the muscle activation but the external electro- 
magnetic field generated by this activation. 



2.4.2 Muscle Force Quantification 

The method will be developed considering the consecutive steps introduced in 
Sect. 2.1.2, i.e. the kinematics identification that defines the model motion, the in- 
verse dynamics that computes the elbow joint torque, and the proposed method to 
solve the muscle redundancy problem. 



Kinematics Identification 

Let us note from the start that the inverse dynamics is a familiar tool to obtain results 
of joints efforts, but it is not obvious to obtain accurate results that could be usefully 
exploited, e.g. for the joint analysis of pathologic cases or the design of intelligent 
prostheses. This problem is illustrated in Fig. 6a during one cycle of forearm flex- 
ion/extension, for which the elbow joint torques have been computed using inverse 
dynamics, using raw measured kinematic and dynamometric data. Concerning this 
method, [9] and [10] indicate that the estimate of the internal forces is particularly 
sensitive to accelerations. Those are classically calculated by the method of finite 
differences starting from the measured positions, which dramatically amplifies the 
measurement errors. 

Further, let us insist on the fact that filtering the undesired high frequencies in 
the measured kinematic data using a low-pass filter, and even smoothing the corre- 
sponding derivatives, i.e. velocities and accelerations, is not sufficient, as illustrated 
in Fig. 6b during the same cycle of forearm flexion/extension, for which the elbow 
joint torques have been computed using inverse dynamics. 

Today, the most accurate way to obtain results of joint efforts is using inverse 
dynamics with corrected kinematics based on a kinematic identification process. 
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Fig. 6 Schematic representation of inverse dynamics methods during one cycle of forearm 
flexion/extension: (a) Pure inverse dynamics, directly using the measured kinematics (totally in- 
accurate); (b) Inverse dynamics with filtered kinematics using a numerical low-pass filter (still 
inaccurate); (c) Inverse dynamics with corrected kinematics using a kinematic identification pro- 
cess (the most accurate way) 



also known as solidification method [12], as illustrated in Fig. 6c. Concretely, this 
process estimates the joint coordinates of the MBS that best fit the experimental 
joint positions Xexp,s- This kinematic optimization can be formulated as a nonlinear 
least-square problem applied for each body configuration, at each time instant t^, 
k = 1, . . . , r, where T is the last time sample of each test. Consequently, the cost 
function fcost(tk) can be written at each time instant t^ as follows: 



fcost{tk) = 2_^ \Xmod,s{q(tk)) - Xexp,s{tk)? 



(14) 



s = \ 



where: 



the index s = 1, . . . , risens indicates the optokinetic sensor. For instance for the 

upper limb, risens = 7 as described in Fig. 3b). 

qitjc) is the vector of joint relative coordinates of the MBS at the time instant t^, 

and is the variable of the optimization process. 

Xmod,s{ii{tk)) is the Cartesian coordinate of the s'^ optokinetic sensor at time 

instant t^, obtained from q(tk), using the direct kinematic model of the MBS. 

Xexp,s{tk) is the Cartesian coordinate of the s'^ optokinetic sensor at time instant 

tk, provided by the experimental set-up. 
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Fig. 7 Optimization process for a body configuration, at a time instant t^ 



Figure 7 schematically outlines the optimization process, which involves two 
consecutive steps: 

1 . A pre-process calculates the mean distances // between the joints for each of the 
jth body segment, using the experimental joint Cartesian coordinates X^y^p^si^k)- 
The reason is that the approach is based on an MBS composed of rigid bodies, 
for which a variable size of the bodies would be irrelevant. 

2. The model joint Cartesian coordinates X„,od,s are given by a direct kinematic 
model using the Ij distances and an initial value (set to zero) of the joint coor- 
dinates q{tk) that we want to determine. The cost function of this least-square 
optimization (14) is defined as the sum of the square components of the absolute 
error vector between Xexp^sitk) and X,„od,s{<l(tk)) of the ns„,s optokinetic sensors 
at the time instant t^. In order to improve the numerical convergence, the optimal 
value of q{tic) is obviously chosen as the initial condition of the next iteration at 
time instant f,t+i- 

Finally, now that the configurations q are properly corrected, they can be filtered 
by precaution using a low -pass filter. Then, the corresponding velocities, q, and 
accelerations, q, can be obtained via numerical differentiation techniques. 



Inverse Dynamics 

The multibody dynamical equations are obtained from a Newton-Euler formal- 
ism [34]: this algorithm provides the vector Qi„y of internal interaction torques and 
forces at the joints for any configuration of the MBS, in the form of an inverse 
dynamical model (15), or a semi-direct dynamical model (16): 



Qi„v = 4>{q,q,q, F,^,, M„„ g) 

= M{q)q + c(q, q, Fext, M^x,, g) 



(15) 
(16) 
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where: 

• ^ (7 X 1) is the vector of the human body relative generalized coordinates, i.e. 
successively 3 arm rotations + 3 forearm rotations + 1 hand rotation' = 7 
components. 

• q and ^(7x1) are the joint velocities and accelerations, respectively. 

• M(q) (7 X 7) is the generalized mass matrix. 

• c(q, q, Fext, Mext, g) (7 x 1) is the dynamical vector containing the gyroscopic, 
centripetal, Coriolis terms as well as the external forces Fg„ (7x3) and torques 
Mcxt (7 X 3) and gravity g {I x 3) applied to the system. 



Muscle Redundancy Solving 

As introduced in Sect. 2.1.2, the key point of our muscle overactuation solving is 
based on an optimization process that computes the forces of the flexor and ex- 
tensor muscle sets during forearm flexion/extension. Starting from the K and A 
factors given by step 1 of the protocol and the filtered flexor/extensor forces during 
flexion/extension, this optimization adapts the K and A factors of the flexors and 
extensors so that the corresponding elbow net torque Qemg best fits the elbow net 
torque Q,„,, given by the inverse dynamics, on the whole trajectory. Practically, this 
problem is formulated using the non-linear least squares method 'Isqnonlin' provide 
by Matlab. 



3 Results 

3.1 Joint Kinematics and Dynamics 

Fig. 8a presents a comparison of the model configuration given by the experimen- 
tal data and the resulting configuration obtained using the kinematics identification 
process. Figure 8a also features the corresponding direction of the instantaneous 
rotation velocity vector onto which the elbow joint torque is projected in order to 
obtain the elbow net joint torque Qemg, corresponding to the torque component that 
contributes to the 'one DOF rotational' motion. Let us note that the other torque 
components also exist, but these ones contribute to the joint cohesion efforts (con- 
straint torques). 

Figure 8b presents the time evolution of the corresponding computed elbow joint 
net torque Qinv during the test, obtained via inverse dynamics. 



' momentarily blocked. 
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Fig. 8 Subject carrying a weiglit and performing several cycles of forearm flexion and extension: 
(a) comparison of the model configuration given by the experimental data and the resulting data 
from the kinematics identification process, and also featuring the weight and the corresponding 
direction of the instantaneous angular velocity vector i? g ; (b) time evolution of the coiTcsponding 
computed elbow joint net torque g,„,. during the test. 
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Fig. 9 (a) Comparison between the elbow joint net torque g;„„ obtained from inverse dynamics, 
the equivalent joint net torque obtained from the calibrated Forces F/&(.,/ (output of protocol 
step 1) via (1), and the equivalent joint net torque obtained from the optimized Forces F fg^cF 
(output of protocol step 2) via (1); (b) Comparison of the solutions of the Biceps brachii and 
Triceps brachii set contributions at one time instant t = ?*: the solution of Ff^^i given by 
the pure calibration, the solution of Ffsie.F given by our proposed optimization process, and the 
solutions of strategies that maximize endurance [13], i.e. that minimize the sum of the weighted 
forces at square or at cube 



3.2 Muscle Forces 



3.2.1 Muscle Force Assessment 



First, Fig. 9a presents the time evolution of the elbow joint net torque Qim obtained 
from inverse dynamics, compared to the equivalent joint net torque obtained from 
the calibrated Forces Ff^ej (output of protocol step 1) via (1), and also compared 
to the equivalent joint net torque obtained from the optimized Forces FfSce,F (out- 
put of protocol step 2) via (1). 
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Secondly, Fig. 9b compares several solutions of the biceps brachii (flexors) 
and triceps brachii (extensors) set contributions at one time instant t of the flex- 
ion/extension: 

1. The solution of Ffs^gj given by the pure calibration (output of protocol step 1, 
and input of protocol step 2). 

2. The solution of Fy&g ^ given by our proposed optimization process (output of 
protocol step 2), corresponding to the re-scaled and re-shifted calibrated value of 

Pf&eJ- 

3. The solution of a strategy that maximizes endurance [13], by minimizing the sum 
of the weighted forces at square or at cube. 



3.2.2 Statistical Validation 

For one subject, Fig. 10 presents the results of muscle forces during forearm flex- 
ion/extension, using the final values of the K and A factors during the muscle 
overactuation solving process. 

First of all. Figs. 3.2a and 3.2b clearly - and fortunately - show a gradation of 
the forces according to the weights carried. More fundamentally, a statistical vali- 
dation of this muscle effort quantification method was performed with six male and 
six female subjects carrying five different weights (from to 4 kg) with several flex- 
ion/extension frequencies {\, \ and | Hz). This validation shows a good inter-test 
reproducibility (also showing a fatigue effect) and a very good correlation (corre- 
lation factor r = 0.99) between Qinv and Qemg at the end of the identification 
process. 
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Fig. 10 Results of (a) the summed flexor (biceps brachii set) forces and (b) extensor (triceps 
brachii set) forces for one subject carrying weights from to 4 kg and performing cycles of forearm 
flexion/extension at 0.5 Hz 
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4 Discussion 

4.1 Joint Kinematics and Dynamics 

The present inverse dynamical model of the human body (Fig. 1), necessarily pre- 
ceded by a kinematic identification of the model configurations, is proposed as 
a satisfying method to estimate the joint efforts in dynamical context [31]. This 
problem being deterministic, Qi„v becomes a sufficiently accurate result that can 
be exploited as a reference for the optimization process that attempts to solve the 
muscle force redundancy (protocol step 2). 

Nevertheless, let us point out that three main limitations of the present inverse 
dynamical model must be discussed in order to improve the process in the future: 

1 . A geometrical limitation, due to the use of spherical joints: The results of the 
kinematic analysis for this experiment show that the spherical joints consid- 
ered here sufficiently fit the envisaged motion, with a mean absolute error on 
the Cartesian coordinates that is inferior to 3 m m in each direction at each joint 
(shoulder -|- elbow). However, using previous investigation results, in order to 
model more complex 3D motions, the present model will be extended to include 
more involved joints in the future, particularly to model shoulder [3] that is far 
from being a spherical joint. 

2. A kinematic limitation, due to the rigid MBS assumption: Like other classical dy- 
namical inverse analyses [17, 19,36,45] in Biomechanics of motion, the proposed 
model is composed of rigid segments. However, in reality, the body is not com- 
posed of a set of rigid bodies. Rather, each body member consists of a rigid part 
(bone), and a non-rigid part (skin, muscle, ligament, tendon, connective tissue, 
and other soft tissue structures) [26]: during any motion, the skeletal structure 
of the body experience accelerations, whereas the soft tissue motion is delayed, 
due to damped vibrations of the member. Consequently, the errors in the opti- 
mized joint coordinates q may introduce kinematical errors in the velocities q 
and accelerations q, and thus introduce errors in the estimation of the internal 
efforts [31]. 

3. A dynamical limitation, due to the approximation of the body inertia parameters: 
The body inertia parameters, i.e. the masses, moments of inertia and centre of 
mass positions of the body members (the arm and the forearm) are approximated, 
using inertia tables [18]. Consequently, the errors in the estimated net joint efforts 
Qi„y increase if the corresponding body member accelerations increase. This is 
the reason why the present model is only proposed for rather low dynamics, such 
as our tests of forearm flexion/extension with a cycle frequency from | to i Hz, 
or other tests such as getting up from a seat [31], gait experiments [30] or other 
motions without significant dynamics or impact. 
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4.2 Muscle Force Quantification 

If we make the assumption that the optimized kinematics and the net joint efforts 
Qinv resulting from inverse dynamics are sufficiently accurate to attempt to solve 
the muscle overactuation problem (as discussed in Sect. 4.1), we still have to deal 
with two main problems [32] in order to estimate the muscle efforts: 

1. the noisy-corrupted raw EMG signals; consequently, we will discuss in 
Sect. 4.2.1 the role of the EMG processing, including the Hill model and its 
numerous parameters, and the choice of parameters that we decided to vary in 
the optimization process (Sect. 2.4.2). 

2. the undetermined number of solutions of muscular efforts; consequently in 
Sect. 4.2.2, we will compare our proposed solution with the main solutions of 
'maximum endurance' strategy and 'EMG calibration' introduced in Sect. 1.2. 

4.2.1 EMG Processing and Parameter Choice 

The raw EMG signals are noisy-corrupted and do not provide accurate quantitative 
values of the actual muscle excitation u{t) and consequently of the actual muscle 
force Fm ■ However, we guess that this information, combined with a matured EMG 
processing (Sect. 2.4.1) and a muscle force calibration process can provide a first 
estimation of the muscle force that will constitute a reasonable initial value of the 
optimization process (Sect. 2.4.2), which will compute the predicted muscle forces 
during the motion. 

Further, if we attempt to solve the muscle overactuation problem using an op- 
timization process, "too many parameters is not good" (Buchanan et al. [8]): on 
the one hand, increasing the number of parameters to vary for a specific test shall 
certainly help to better fit the estimated joint torque Qemg to the joint torque Q^v 
obtained from inverse dynamics; but on the other hand, increasing the number of pa- 
rameters may increase the sensitivity of the optimization process, and thus decrease 
the repeatability of the muscle force results from one test to another. 

In consequence, we decided to use the EMG processing (Sect. 2.4.1), and more 
specifically the Hill model, as a black-box based on parameter tables, so that we 
will not have to deal with too many parameters in the optimization process, and we 
finally prefer to deal with the scaling K and the shifting A factors of the muscle 
forces Ffs^ej resulting from the Hill model. 

4.2.2 Comparison to Existing Metliods 

When using an anatomical model of the musculoskeletal system, an infinite unde- 
termined number of solutions of muscular efforts can correspond to the joint efforts 
Qinv, because the joints are overactuated. In order to find the best physiologically 
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admissible solution, we decided to estimate the muscle forces using kinematic data, 
the Qinv that becomes our "reference value", and the information contained by the 
EMG signal. 

The comparison between all considered solutions (Fig. 9b) shows that: 

1. The solution of Fy&e / given by the pure calibration does not minimize Qjnv — 
Qemg but, being not to far from the final solution, their role as initial values of 
our optimization process was useful. 

2. The solution of Ffg^g^p given by our proposed optimization process minimizes 
Qinv — Qemg with a quantifiable difference between Qinv and Qemg (Fig- 9a) 
and this can be proposed as an admissible solution of muscle force repartition. 

3. The solutions of strategies that maximize endurance [13] by minimizing the sum 
of the weighted forces at square or at cube do not seem to minimize Qinv— Qemg ', 
these are just approximation of the human motion strategies; further, we can no- 
tice that the solution of F/&g ^ given by our proposed optimization is generally 
superior to the forces given by these strategies, which tends to confirm that the 
actual force repartition during motion does not always maximize endurance, and 
more generally varies from a strategy to another during the motion, as noticed 
Crowninshield et al. [13] who developed and discussed these strategy models. 

To end, the statistical validation will show a good inter-test reproducibility (also 
showing a fatigue effect) and a very good correlation (correlation factor r = 0.99) 
between Qi„v and Qemg at the end of the identification process. This reinforces that 
our method of muscle force calibration plus optimization, based on muscle activity 
measurements, is a promising alternative to the methods based on the strategies 
usually used in the literature. 

In summary, today, we can say that there are as many solutions of muscular 
efforts as methods that attempt to predict these. Nevertheless, the solution of Ff&e,F 
given by our proposed protocol is coherent with the joint kinematics and system 
dynamics, while exploiting as well as possible the experimental information on the 
muscle activation. 



4.3 Prospects 

The main prospects of this research is to quantify with a satisfying accuracy the 
main muscle set efforts of subjects in different dynamical contexts, and to apply the 
model to develop: 

• complementary diagnostic and follow-up tools for several pathologies such as 
lumbalgy, hemiplegy, scoliosis and pathologies that can hardly be diagnosed, 
such as fibromyalgy 

• dimensioning tools for the design and choice of adapted prostheses by physicians 
and clinicians 

• studies of effort hardness and ergonomic adaptation, e.g. when people with 
disabilities go into/out of vehicles 
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Computing Time Reduction Possibilities 
in Multibody Dynamics 

Thorsten Schindler, Markus Friedrich, and Heinz Ulbrich 



Abstract This document discusses computing time reduction possibilities for tiie 
dynamic simulation of large nonsmooth multibody systems. Exemplary regarding 
the model of a pushbelt continuously variable transmission known from literature, 
simplifications are derived to examine the underlying problems. Usually one has 
to deal with the calculation of an adequate initial state and the reduction of the 
computational effort during integration. In three sections stationary belt models for 
the determination of an initial value are proposed, time step size enlargements due to 
implicit time integration schemes are analysed and the distribution of the main effort 
per time step on several central processing units by shared memory parallelisation 
is investigated. 



1 Introduction 

One area of research at the Institute of Applied Mechanics of the Technische 
Universitat Miinchen is the field of nonsmooth multibody systems. These special 
mechanical systems basically include rigid bodies and in space discretised de- 
formable bodies [25] in an hybrid way. They are additionally characterised by rigid 
unilateral and bilateral contacts as well as impacts, which lead to discrete jumps 
within the system's velocities. So the degree of freedom is not a constant function, 
but changes during the simulation process and determines a time-variant topology. 
Like described in [26] a unitary mathematical and numerical formulation based on 
measure differential equations (MDE) and projection constraint functions was pro- 
cessed in the last decades at different research institutes, summarised e.g. in [1,3,4], 
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in [9, 17] and in [8, 14, 15]. It allows for the efficient integration even of industrial 
systems with large numbers of transitions [16] and avoids both high artificial stiff- 
nesses and additional modelling errors due to regularised interactions. 

The software for modelling and simulation of nonsmooth dynamical systems at 
the Institute of Applied Mechanics is called MBSim [10]. It is able to handle nearly 
arbitrary dynamical systems according to 

q = Y{q)u, (1) 

M(q)k = h{q,u,t) + W (q)X, (2) 

(^,M,A,Oe^ (3) 

with the transformation matrix Y for the generalised velocities m, the mass matrix 
M and the generalised contact directions W possibly depending on the positions q. 
The time / explicitly occurs in the right hand side h and in the algebraic constraints 
^ for the solution of the interaction reactions A . 

For special complex applications some improvements are necessary to enhance 
the behaviour of the underlying solution algorithms and to further reduce the 
effective computing time, significantly. One example of such challenging appli- 
cations is the computational description of the pushbelt Continuously Variable 
Transmission (CVT) where an input and an output pulley as well as the pushbelt set 
up the variator of the transmission system {left side of Fig. 1). Thereby, each of the 
pulleys consists of a fixed and an axially moveable V-shaped sheave. The pushbelt 
is composed of approximately 400 elements which are guided by two ring pack- 
ages of nine to twelve steel rings {right side of Fig. 1). The differential equation of 
a planar model has been derived and validated [11,19]. Reference [20] presents the 
extension to spatial dynamics also indicating the numerical difficulties concerning 
the extremely increasing computing time in comparison to the planar case. 

Within this framework, the current paper reproduces simplified three-dimensional 
CVT models to examine the underlying problems and deals in detail with computing 
time reduction possibilities in multibody dynamics. 





Fig. 1 Pushbelt variator and pushbelt with elements 
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2 Determination of a Suitable Initial Value 
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In the majority of industrially relevant cases for oscillatory mechanical problems 
the analysis of a stationary target state is of primary interest. So, it is preferable 
to calculate a suitable initial value for positions and velocities fulfilling a stationary 
equality of forces and torques. This at least reduces the decay time of high-frequency 
vibrations at the beginning of the time dependent simulation. The calculation of such 
an initial value depends on the specific simulation model. One possibility is to derive 
a further abstraction of the dynamic model equations. Within the thematic scope 
one can e.g. consider the various stationary continuous belt models [18,21], which 
promise to save about 0.2 s unnecessary real simulation time. Though, implications 
for the computing time depend on the relationship of the complexity of the dynamic 
and the stationary model. 



2.1 Kinematics 



With Fig. 2, it is ir = ^^ the initial transmission ratio of the neutral fibre's output 

and input radius, II = i/sin^ {(p) d\ + c/^; the trum length as well as the arc 
lengths of the neutral fibre given by bj = 2rj {n — cp) and bo = 2ro(p- It holds 



ri -ro = cos ((p)d A, 
Ir = 2tL +bi +bo 



(4) 
(5) 



with axes distance cIa and belt length Ir. Thereby, the alignment c?aiign has to be 
calculated with the relative position of in- and output pulley again depending on 
the transmission ratio. The system of nonlinear equations (4)-(5) in the unknowns 
r/ and cp e [0, Ijt) is solved with a NEWTON-method using numerical JACOBIAN 
evaluations and the initial values 




sin {(p)d^ 



Fig. 2 Initialisation (axial view and view A-A) 
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n^ = (0.5 lR-dA)/7t, (6) 

(fs = 7t/2 (7) 

from a setting with iy = 1 . The results are used to define position values of the 
specific pushbelt CVT parts and to form a basis for the calculation of velocity esti- 
mates in the next sections. 



2.2 Kinetics 

In [21, Chapter 3.5] a planar velocity initialisation of a CVT is done successfully 
with a stationary kinetic belt model [12, 18]. In the following, the missing veloc- 
ity description is derived in a similar way also not regarding outer plane effects. 
Concerning the belt model, the following assumptions have to be kept in mind: 

• Planar continuum belt model with only longitudinal elasticity, 

• Circle shaped pulley enlacement, 

• Symmetrised wedge angle at the pulleys, 

• Constant average COULOMB friction in the belt sheave contact, 

• The influence of belt enlargement on radius change is neglected. 

From the kinetic point of view the load torque induces different longitudinal forces 
in the trums because of friction forces between sheaves and belt. The difference of 
these forces compensates the excitation and is the same for the input and output 
pulley. First, the associated force propagation along an arc is derived; last, global 
estimations yield the final equations. 



2.2.1 Geometry of the Excitation 

Using Fig. 3 of an infinitesimal sheave sector with opening angle d(p, axis z, tangent 
t and radius direction r in the disk rotating coordinate system, it is g the relative 
elastic slip velocity at the cone surface. The relative elastic slip velocity in the pro- 
jection plane is gi, with y being the angle of elastic slip, S the local half wedge angle 
including deformation and Ss its effective part. The decomposition of the friction 
force [idN in the axial, tangential and radial direction is given by 

fj.dNsm{8s), (8) 

/xdA^cos(i55) sin(7r — y) , (9) 

lidN cos (Ss) cos (jt — y) . (10) 

Comparing the direction of g for y = jr given by g,^ and for y = jt/2 given by 
g7z/2 yields 

tan(S5) = -tan(S)cos()/). (11) 
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HdN cos(ds) cos(n - y) 

^idNcos{5s)sin(7i- 7) 



fidN sin (Ss) 



trace of belt point 




Fig. 3 Belt slip at sheave 



Fig. 4 Absolute velocity 



tangential. ...\Jv 




2.2.2 Relationship of Tangential and Radial Belt Velocity 



In Fig. 4 one can see the trace of a belt point from axial point of view in a global 
coordinate system. So in addition to tiie relative velocities of Fig. 3, also the ro- 
tational velocity of the sheave ro) is drawn as a positive initial velocity and the 
curvature angle & describing the change of the running radius r with respect to cp is 
defined. This results in formulas for the relative tangential and radial velocity 
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^*i.„ = vcos(??) -ro) = v-ra, 



^„. = ^='-« 



with the hnearised beh velocity 



1 + 



EA 



vo 



T. Schindler et al. 
(12) 
(13) 

(14) 



depending on its undeformed part vq, the longitudinal stiffness EA and the tensile 
force L. Together it is 



, , , g*.., [EA + L]Cs-r 
tan(y) = = -, 



with the elastic slip constant 



assuming co ^ 0. 



Cs 



vq 
EAco 



(15) 



(16) 



2.2.3 Tangential Balance of Forces 

Figure 5 shows the axial view to write down the equality of forces in tangential 
direction 



[dL — mdv] cos (d^/2) = IfidN cos (Ss) sin (ji — y) 



(17) 



2ndN cos(5s) siii(7r- 7) 



cos 
sin 

L+dL-m (v+dv) 



Fig. 5 Axial view 
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involving the double tangential friction force from (9) on the right hand side because 
of a planar description and the constant stationary temporal mass distribution 



m = m vo 



(18) 



resulting from the local mass distribution m* of the undeformed belt. This is needed 
both to simplify the equations and to improve their entropy according to [2 1 , Chapter 
3.5.1]. Because of (14), it is 



dL 

dv = — vo. 



(19) 



2.2.4 Radial Balance of Forces 

Figure 6 can be used to write down the balance of forces in radial direction 

[2L + dL — rh (2v + dv)] sin (d^/2) = 2dA'^ [sin (S) + fj. cos (Ss) cos (ir — y)] . 

(20) 

Thereby, the internal values appear on the left hand side, whereas the radial 
component of the friction from (10) and of the normal force contribute to the 
right hand side. 



2.2.5 Axial Balance of Forces 

Also the axial equilibrium of forces can be explained with Fig. 6. With the sheave 
expansion force S it is 



d5 = dN [cos (S) - II sin {8s)] ■ 



(21) 



dS 



Fig. 6 Radial view 



2/j dTVcos ( 5s) cos (K-y) 



fidNs'mids) 




sin-projection 

cos-projection 

[2L+ dL]sin(dip/2),-m[2v+ dv]sin(d(p/2) 



COS 
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2.2.6 Change of the Running Radius 

The equations (8)-(21) show the framework of [12, 18, 21] in its full generality. In 
contrast, changing of the running radius due to sheave tilting, transverse belt elas- 
ticity and axial sheave elasticity are not considered in the following because of their 
minor influence for the initialisation [21, Figs. 2.8 and 4.10]. These assumptions 
involve 

8' = 0, r' = 0, Y = ±jr/2, Ss = (22) 

and avoid numerical difficulties mentioned in [21]. 

2.2.7 Summary of the Stationary Belt Model and Environment Interaction 

From (17) and (20) it follows by the linearisations 

fd(p\ . . fd(p\ .dip .dip 
;(^-j = l. sm(^-j = -,dL-=0. (23) 

some addition theorems and the consequences of the last paragraph 

dip [1 — mft)Cs] sin (i5o) sin((5o) [E A — mvo] sin (Sq) 

using the intermediate step 

dL = mdv + IfxdN cos (Ss) sin (y) = mdv ± 2jidN, (25) 

Ldip=mvdip + 2dN [sin (S) — /x cos (Ss) cos (y)] = mvdip + 2dN sin (So) . (26) 

Equation (24) is an Euler-Eytelwein description of the longitudinal force prop- 
agation along an arc resulting in 

L (ip) = [Lo - K] e±'^*('P-w) + K (27) 

with 

u m*VnEA 

At := , , , K ■- 5 . (28) 

sm (do) EA — m*VQ 

If the denominator in the definition of K equals zero this would be a contradiction 
to (25) and (26). 

The global belt setting is drawn in Fig. 7. The input arc E—A, the output arc 
B—D, the tight span D—E, the slack span A—B, the coordinate system and the 
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radii ro, rj are canonical notations. The longitudinal force at the end of the input 
arc Lj , occurs at point A. Similar, Lj can be found at E, F and D, Ln at B 
and C as well as L o„„i at D, E and F . Generally, the longitudinal force increases 
at the output and decreases at the input arc only in an active part 0, but is always 
positive [21, Chapter 4.1.1]. When modelling elastic sheaves, the points F and C 
are called orthogonal points [18]. Altogether, this yields a consistent longitudinal 
force equivalence 



resulting in 

Lo,„ = K V <Po = ^i- 
Sticking implies two additional equations 



+ K 



(29) 



(30) 



ricoi = vo 



(-1^) = 



Vo 



1 + 



EA 



(points£-f), (31) 



The equality of torques 



= vo (l + 



EA 



(points B — C). 



Mo = ro (Lo,„, - LoJ = ro {Lo„, - K) (6^^**0 - l) 



(32) 



(33) 
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with Mo being the positive load torque and the axial equality of forces 



f , f L\EA-m*vl]-m*vlEA 

Fco = / 5'd^ = / -^ ^^ ,l\j,, ° dy , 

Jq,o Jcpo 2 tan (So) EA 



EA — ot*Vq 
2 tan (So) EA 

EA-m*vl 
2 tan (So) £^ 



Ld(^ + / Ld(^ 



*,,2 



tan (So) 



V 



2Lo.„^ + (Lo„, - ^) ( ——^ ^o 



*,,2 



m v: 



•^ 



tan (So) 
(34) 



have to be solved concerning the output pulley. Hereby, Fcq is the positive axial 
clamping force at the output pulley as well as (26) and (21) yield the expression 
for S'. 



2.2.8 Reduction of the Final Equations 



Goal of this paragraph is the computation of ci>o,vq,<Pi,<Po and L o,„ for initialising 
velocity values. Of course, it is £ > 0, ^ > 0, So > 0, </) > 0, m* > 0, r/ > 0, 
rp > 0, a)/ > and so vq > for practical settings. Then clearly 



EArooio „ , 
Lo = EA 

VQ 



(35) 



according to (32). Concerning (30)-(34) there are two cases. 

1. For Loi^ = K it follows necessarily Mo = and Fcq = 0. The condition 
Fcq = means that there is no sheave-belt contact, which is practically not 
relevant, and Mo = yields Lo^^ = K or 0o = from the mathematical 
viewpoint assuming ji > 0. From the physical viewpoint obviously only <Po = 
and so also <P/ = is interesting. So, it is 



ri 

a>o = — Oil, 
ro 



(36) 



VQ 



y{Fco tan (So) + EA(p)^ + 4EAm*rfa)](p^ - EA(p - Fcq tan (So) 



2m* r J 0)1 (p 



(37) 



2. If Mo 7^ it is necessarily 0/ = <Po and Lo„, ^ K. So, 



fi* 



Mo 



ro {Lo,„ - K) 



+ 1 



(38) 
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being only defined for Mo > ro {K — Lo,J- This condition depends on the 
kinematic and kinetic setting and states the physical application in a same way 
as the inequalities 



0/ < 2(jT — cp) , 0O < 2(p 



(39) 



define the active arcs. Equations (31) and (34) remain only depending on coq 
and vq. They can be solved by a generalised NEWTON method with numerical 
Jacobian evaluation and starting values from the Mo = case. 

Now, CD J and coo can be used for the initialisation of the pulleys' and ring packages' 
angular velocity. According to (14) and (27) the expression 



1 + 



(eA (^ - l) - k\ Qi^*^<f>->Po) + K 



EA 



vo 



(40) 



explains the behaviour of the ring package belt velocity in an active arc starting from 
Vin at ^ = cpo. Element velocities have to be inherited from the ring packages. 



3 Computational Effort During Integration 

The computational amount during the integration of a differential equation can be 
divided in the amount per time step and the number of time steps as a multiplier. 
In the following the spatial simulation of a flexible belt with rigid elements linearly 
arranged is discussed as a benchmark problem (cp. Fig. 8). The main effort per time 
step is defined by the kinematic element and finite element update and the contact 
behaviour; the time step size itself is declared by the numerical stiffness of the flexi- 
ble part. As the contact iterations cannot be simplified conceptually, in the following 
the element update loop parallelisation and the ring package model stabilisation are 
analysed to reduce the overall CPU time. 



Fig. 8 Rigid elements with 
internal unilateral contacts on 
flexible belt 
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3.1 Stabilising Equations of Motion 

According to [20] the flexible part is divided in Nb spatial large deformation beams 
giving a lot of modelling possibilities available in the literature [24]. As a co- 
rotational [6] model in [25] shows an efficient behaviour in the planar case compared 
to ANC formulations [7, 13, 22] it is extended to three dimensions using inertial 
approaches [2, 23] and applying the physically interpretable Euler-Bernoulli 
beam formulation. The mathematical derivation is based on the ideas of finite el- 
ement theory for assembling and multibody formulations for the evaluation of the 
equations of motion for each finite element. 



3.1.1 Coordinate Settings 

Using a general stationary frame of reference, the entire kinematic of one finite 
element can be described like in Fig. 9. Using a reversed Cardan parameterisation 

</)o (x) := (Psq + Wq {x) , (fi (x) := (psi + w\ (x) , (p2 {x) := (ps^ + w'^ {x) 

(41) 

a set of internal coordinates 

Qi := ixs,yS,ZS^(PSO'VSi-'PS2'^''^Ll,dRi.PLi,PRi,dL2,dR2.PL2'PR2-'<^o) 

(42) 

is defined by the position vector and the angle parameterisation of the trihedral of 
the finite element centre as well as the longitudinal strain, the coefficients 

Wi {-la/2) := dti, Wi (0) := 0, w; (/o/2) := ^r,-. i = 1,2, (43) 

w\ i-lo/2) := PLi, w\ (0) := 0, w\ (/o/2) := Pr^, i = 1,2 (44) 



du dii Pl] Pl2 




Fig. 9 Internal coordinates 
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of the ansatz functions with the finite element length /q and the torsion 



kq := /b • in! = Wq - sin {tpsi ) 



w, . 



The degree of the polynomials 

Wi := a„.x^ + bn.-x'^ + c^-.x^ + d„.x'^ , i = 0, 1, 2 



251 



(45) 



(46) 



is a compromise between too much stiffening for lower order and too much support 
for higher order with the coefficients of vvq being constrained by the constant tor- 
sion characteristics of (45). Altogether, rigid and elastic body motion are decoupled 
and a compact form of the equations of motion with appropriate approximation not 
depending on the boundary conditions is available for evaluation. 
For coupling of finite elements the global coordinates 



qg := {xi,yL,ZL,VLo,VLi,VL2'CLi,CR^,ci,,CR^,XR,yR,ZR,(pRii,(PRi,(PR2) 



with 



cl, := wi (-/o/4) , c«, := wi (/o/4) , cl, := W2 (-/o/4) , cr^ := W2 (/o/4) 



(47) 



(48) 



are used (cp. Fig. 10) to get equations of motion in minimal representation. The 
information between the coordinate sets is transferred by the motion of the neutral 
fibre 



'/ 



/r (x) = (1 + e) / jtdx = jvs + (1 + e) xjts + wi (x) 7115 + W2 (x) ibs 



(49) 




Fig. 10 Global coordinates 



252 T. Schindler et al. 

parametrised by the Lagrangian coordinate x with 

wi := ^nWi + ^iW2, W2 := r]nWi + r]iW2. (50) 

This results in a transformation F (qi, Qg) = 0. One part can be solved analytically 
with respect to the internal coordinates: 

irt + irR [?« (6?Li + dR^) + ^i {dL2 + ^^Wa)] i^s 
irs = 



2 2 

[% {dLi + dR^) + r]^ {dL2 + dR2)\ i^s 



(51) 



and 



£= r(^r«-'f''L)-/t5-l, (52) 

'0 

iiQ = J- [vro - VLo - sin {(Psi ) (Pr2 -^L2)]- (53) 

'0 

A system of nonlinear equations 

Fi := (pso ^ + sm (^^J = 0, (54) 

F2 := (fsi ^ \ ^ = 0, (55) 

F3 := (PS2 2 ^ 2 ^ 

F4:=;6ij, -;6li -'?'«, +'?'L, =0, (57) 

F5 := y6ij2 - Pl2 - VRi + VL2 = 0, (58) 

Fe := ^n (c/jji - c?Li) + ^g {dR2 - dti) - {rR - iTl) ■ ins = 0, (59) 

F7 := r]f: (dR^ -dti) + % {dR2 - c^Lj) - ii^R - /Tl) • /bs = 0, (60) 

Fg := 2K.X/256 + 2d,, Jl/\6-CR, -cl, = 0, (61) 

Fg := 2fl„,/oVl024 + 2c„,,/oV64-CR, + cl, = 0, (62) 

Fio := 2fo„2/oV256 + 2d,,^ll/\6 - cr^ - cl^ = 0, (63) 

Fii := 2fl„2/oVl024 + 2c,,2/oV64 - cr^ + cl2 = (64) 

remains which can be solved with Newton's method using analytical JACOBIAN 
evaluations. The derivatives fulfill the relations 



Computing Time Reduction Possibilities in Multibody Dynamics 

=: JigQg, 



dqi . 
dqg 



df 



/ dq; \ . dQi .. 



dqg 
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(65) 

(66) 



with the expressions involving the internal coordinates Xbe of the implicit relation- 
ship being calculated by the chain rule 



3F dxbe 



_9F 
9qg' 



9xbe dqg 

dF d /dxbe\ ^ _ d^ f3F\ _ d_ ( 3f\ dxbe 
V dqg / df I 3qg I df I 3xbe ) dqg ' 



9Xbe df 



(67) 
(68) 



3.1.2 Equations of Motion 

The bending length 

"10/2 



lh:= 



-L 



la 1 2 



\ir'\Ax % (1 + 6)/o + 



1 



/ WjVVjdx + / WjWjdjc 

J-lc,l2 J- 



.Iq/2 
-Iq/2 



(69) 



contains second order terms concerning bending and so allows for geometric non- 
linear foreshortening. The corresponding strain is given by 



^b — ^0 ^ . 1 



/k}/2 ^ ^ rh)l2 

vvjVVjdx -I- / W2W2AX 
-lnl2 J- 



-I0/2 



(70) 



Considering mass conservation and at most quadratic elastic deformation terms 
yields for the gravitational, elastic and kinetic energy 

f'0/2 
Vg = -pRARjg- / /rdx 

J-lo/2 



= -pRARjg- 



ErAr 2 



//0/2 rk)l2 

widx/ns -I- / W2^x ihs 
-la 1 2 J- 



-Iq/2 
rh/2 



-Iq/2 



(71) 



ERh p'^." . n2 
I0 + —— / (wj -KyA Ax 

2 y-Zo/2 ^ ' 

+ —r— (w'2-'^2o) dx-F— — / /Codx, (72) 
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//o/2 _ rh)l'^ 

||/r||^d.T + /o / CDJA. 
-Zo/2 J-k)l2 



1 



(73) 



with initial curvatures ^i^ and ^2^ as well as the projection of the angular velocity 
on the local tangent cot = ipo — sin ((pi) <p2 neglecting angular bending dependencies 
in the kinetic energy terms. The standard parameters are the density pR, the cross- 
section Ar, Young's modulus Er, shear modulus Gr, the area moments of inertia 
Ii, 1 2, lo and the gravity jg. 

With the Lagrange II formalism 

it is possible, to derive the equations of motion. Because of 7" = 7" (qi, qi) it holds 



v9qi/ 9qf ' 9qi9qi 



df 



Hence, the mass matrix and the smooth right hand side are given by 



d^T .._fdTY (^{ye + Vg)Y d^T 



M,:=^. hi^=(^)"-| '\: ^M -.T^qi (76) 



9qf V^qi/ \ ''qi / 9qi9qi 



such that 



Miiii - hi = 0. (77) 

Globally the equations of motion fulfill 

jjMiJig qg - J.^ (hi - Mjjigqg) = 0. (78) 



3.1.3 Analysis of Instability 

When increasing the number of finite elements in an explicit integration scheme 
it has been recognized that Newton's method does not succeed in the solution 
of the transformation between internal and global coordinates although an analyti- 
cal Jacobian and double machine precision are used for the nonlinear equations' 
solver (cf. Sect. 3.1.1). A solution can be found by decreasing the integrator time 
step size or using a linear implicit scheme with the necessary evaluation of finite 
element JACOBIAN matrices of the right hand side being the drawback. To avoid 
this additional effort the problem is analysed by reduction using norm consistent 
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linearisation of the transformation in the x^ — y^ -plane. This changes the following 
equations from Sect. 3.1.1 



Fi := (fso 



VLq+VRq 



= 



(79) 



Fe := dR2 - c^Lj + i^R - ^l) sin (i^Sj) - iyR - yt) cos (^Sj) = . (80) 
F7 := dL, - dR^ - izR - zl) = (81) 

and yields a decoupling of the spatial motion. Finally, it results a nonlinear equation 



F (^52) '-{xr-xl) sin {(ps^) - iyR - yt) cos {tps^) 



+ 



64 
I? 



3/0 (ipL2 + VR 
64 



(^-^^^-,s.)- 



CLo + CR2 



= 



(82) 



in (ps2 comprising an affine equation and a superposed oscillation. Figure 1 1 shows 
this nonlinear function and its affine part in different settings. The solution of the 
last iteration and so the canonical starting value (marked with a cross) is always in 
the antinode around zero. If the integrator time step size is small enough (left figure) 
this yields a new solution (marked with a circle). Otherwise (right figure) e.g. when 
global input parameters (the axis intercept) blow up for increasing simulation time, 
the starting value for Newton's method would have to be shifted to the antinode 
of the expected solution. This instability depends on the oscillation projected to the 
abscissa and is typical with explicit integration schemes. The time step size for the 
explicit integration scheme At < 5 ■ 10~^ s is defined by the numerical stiffness of 
the ring package equations of motion and not hy At < 5 ■ 10~^ s resulting from the 
Nyquist-Shannon sampling theorem and 20,000 Hz being the polygonal frequency 
upper bound of a pushbelt CVT. The theoretically possible improvement factor 100 




Fig. 11 Zeros of tiie planar beam transformation root function 
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in practice reduces to a maximum realistic improvement factor depending on the 
computer architecture because of the additional effort per time step of linear implicit 
integration schemes. 



3.2 Parallel Computing Architectures 

Today, Moore's law states that the number of transistors on a standard processor 
doubles every 18 months. Based on empirical studies, Gordon Moore formulated 
this rule of thumb in 1 965 and proposed a validity period of about ten to 1 5 years in 
2007. This yields a profit in performance, which cannot any more achieved by higher 
clock rates due to technical limitations but by parallelism on CPU level with multi- 
core architectures. The consequences for software developers are the adaptation of 
existing programs and the design of new ones concerning these hardware trends. 

In the case of multibody systems normally one has no memory limits, such that 
a multiple instruction multiple data (MIMD) architecture with shared memory can 
be used. This is the field of the OpenMP interface [5] giving the possibility to 
simply extend a serial program with control structures for parallelisation. Then, the 
advantages can be measured by 

h 
speed-up s„ := — , (83) 

'« 

efficiency e^ := — (84) 

n 

whereby t„ is the run-time of the program on n processors. There exist several the- 
oretical estimates for the maximum achievable speed-up, 

1 1 

Amdahl's law s„ < ; — ^ — , (85) 

a + i^ a 

ts + ntp 

Gustafson's law Sn = — =^n. (86) 

t, + tp 

Both rules are based on different ideas. Amdahl assumes the serial code part a to 
be constant when considering one simulation model. This results in an upper bound 
for the speed-up because of administrative overhead when enlarging the number of 
processors. Gustafson looks at the simulation time of a parallel program on a single 
core machine with the sequential simulation time part ts and the parallel simulation 
time part tp. For an asymptotic consideration he assumes the sequential simulation 
time to relatively decrease if one uses more processors e.g. in the practically relevant 
applications of larger simulation models. Altogether, one has a more pessimistic rule 
of Amdahl and a more optimistic one of Gustafson available for comparison with 
experimental scaling measurements of a concrete program. 
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Fig. 12 Program flow 



Profiling yields that in the discussed multibody system context of Fig. 8 the main 
computational cost per time step is spent for the kinematic update of the rigid ele- 
ments and of the finite elements, potentially being used for the discretisation of the 
flexible belt. Both items are organised in loops from the software development point 
of view. This can be summarised by the formulas for the components of the global 
equations of motion 

M=^jfM,J, h=^jfh,-, W = ^jfW,-. (87) 



Thereby, the single update jobs of the summation loop are independent with respect 
to memory and computation; the final inserting into global memory space is the 
only critical task. According to Amdahl's law S4 < 2.05, €4 < 0.5 and according 
to Gustafson's law S4 < 2.2, €4 < 0.55 are expected for a four-core machine. 
The results in Fig. 12 show a program flow with parallelised updates, sequential 
summation, blocking of threads and not parallelised segments. A speed-up ^4 = 1.7 
has been achieved resulting in an efficiency factor £4 = 0.43. So although, e.g. 
contact iterations cannot be simplified conceptually, the result of parallelisation is 
quite promising but there should still be possibilities to improve the scaling of the 
parallel parts about 0.07—0.12 efficiency values in comparison with the theory and 
to extend parallelisation to here not parallelised parts. 



4 Conclusion 

Computing time reduction is important especially for holistic validation. In the 
context of simplified CVT models the present paper detects possible bottlenecks 
and proposes solution possibilities. In particular, the determination of an initial 
state, stabilising methods and parallelisation is outlined. Summarising, a good 
initial state saves about 0.2 s real simulation time, implicit integration schemes 
propose a maximum realistic speed-up depending on the computer architec- 
ture and parallelisation according to Amdahl's law has a maximum asymptotic 
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speed-up of 3.23 for the discussed examples. Concerning the last two issues with 
a currently attained speed-up 54 = 1.7 it is worth to further improve and extend 
parallelisation techniques in future work. 
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Optimization-Based Design of Minimum Phase 
Underactuated Multibody Systems 



Robert Seifried 



Abstract An underactuated multibody system has less control inputs than degrees 
of freedom, e.g. due to passive joints or body flexibility. The analysis of the mechan- 
ical design of these kind of underactuated multibody systems might show that they 
are non-minimum phase, i.e. they have an internal dynamic which is not asymptoti- 
cally stable. Therefore, feedback linearization is not possible, and also feed-forward 
control design for output trajectory tracking becomes a very challenging task. In this 
paper it is shown that through the use of an optimization procedure underactuated 
multibody systems can be designed in such a way that they are minimum phase. 
Thus feed-forward control design is significantly simplified and also feedback lin- 
earization of the underactuated multibody system is possible. 



1 Introduction 

Underactuated multibody systems possess less control inputs than degrees of 
freedom. Examples are multibody systems with body flexibility or passive joints. 
Due to the underactuation, the method of inverse dynamics known from fully ac- 
tuated systems, see e.g. [2, 20], cannot be used for the design of controllers for 
end-effector trajectory tracking. Thus in the case of underactuation the controller 
design is much more involved than in the fully actuated case. Thereby, for trajectory 
tracking of underactuated multibody systems generally advanced modern nonlinear 
control techniques are necessary. Using concepts from differential geometric control 
theory, see [8,14], the analysis of the mechanical design of underactuated multibody 
systems might show that they possess an internal dynamics. The internal dynam- 
ics under the constraint of constant zero output is called zero-dynamics. Systems 
with asymptotically stable zero-dynamics are called minimum phase, otherwise 
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non-minimum phase. For minimum phase systems feedback linearization, a 
nonlinear control technique well-suited for output trajectory tracking, is possi- 
ble, see [4, 8, 14, 18]. Also the design of feed-forward control for minimum phase 
systems is rather straight forward. Conversely, for non-minimum phase systems 
feedback linearization is not possible. Also feed-forward control design is in this 
case much more complex and requires the numerical solution of a two-sided 
boundary value problem, see [6, 17, 21]. Therefore, the aim should be to design 
an underactuated multibody system in such a way that it is minimum phase. Thus 
already in the early state of the design process, mechanical design and control 
design should be considered concurrently. In [1] such a methodology is used to 
design differentially flat underactuated planar manipulators by using a special mass 
distribution. In this case, no internal dynamics remains and full-state linearization 
is possible; however the approach might require the use of larger counterweights. 

In this paper, an optimization-based design procedure is proposed in order to 
design underactuated multibody systems in such a way that the internal dynamics 
becomes stable. In the proposed optimization procedure, the design parameters are 
the mass distribution of the multibody system. This can e.g. be achieved by addi- 
tional masses which are added to defined locations of the multibody system. The 
optimization criteria is two-stage and firstly requires that all eigenvalues of the lin- 
earized zero-dynamics are in the left half-plane, and secondly that initial errors in 
the zero-dynamics decay rapidly. The analysis of this optimization problem shows 
that there are many local minima, and therefore a particle swarm optimization pro- 
cedure is used. It is shown that minimum phase property can be achieved with only 
a modest increase of the total mass of the underactuated multibody system. The ef- 
ficiency of this optimization-based design approach is demonstrated by simulations 
for manipulators with one and two degrees of underactuation, respectively. 



2 Trajectory Tracking Control 

Underactuated multibody systems with / degrees of freedom, generalized coordi- 
nates q e M^ and inputs u e IR"' with m < /, i.e. control forces and torques, are 
considered. The nonlinear equation of motion is given by 

M{q)q+kiq,q) = g(q,q) + Biq)u, (1) 

where M is the mass matrix, k the vector of generalized gyroscopic and centrifu- 
gal forces and g the vector of applied forces. The input matrix B distributes the 
control inputs u onto the directions of the generalized coordinates. In the case of 
an underactuated multibody system the input matrix B cannot be inverted and the 
classical approach of inverse dynamics cannot be used. Thus, more advanced non- 
linear control techniques are necessary. In the following feedback linearization and 
feed-forward control design for output trajectory tracking of underactuated multi- 
body systems is presented. These approaches are based on concepts from differential 
geometry and its theoretical background is described in [8, 14, 18]. 
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2.1 Input-Output Normal-Form 

The nonlinear input-output normal-form is the basis for feedback linearization as 
well as for feed-forward control design. This input-output normal-form is obtained 
by applying a coordinate transformation to the equation of motion. This diffeomor- 
phic coordinate transformation is given by z = 0(x), where x are the original 
coordinates and z are the coordinates of the input-output normal-form. A local dif- 

feomorphic coordinate transformation exists if the Jacobian-matrix J = ^^ ' is 
nonsingular. In general this transformation requires a state-space representation of 
the nonlinear system and the symbolic computation of Lie-derivatives of the out- 
put y . However, even for multibody systems with very few degrees of freedom, 
these symbolic calculations become very complicated. Therefore, in the following 
it is shown, that for a special type of system output y the nonlinear input-output 
normal-form can be directly derived from the second order differential equation of 
motion (1). In a first step the equation of motion (1) is partitioned into two parts, 

Maaiq) Mau{q)\ pal ^ \ka{q,q)\ ^ pa(?:^)1 ^ \Ba(q)\^ ^2) 
MIM) Muu{q)\\_qu\ \_ku{q,q)\ Uuiq^q)] L^«(?)J 

Thereby the submatrix Ba e jr'"^'" has rank m. The first m rows of the partitioned 
equation of motion (2) are referred to as actuated part associated with the m actuated 
coordinates q^^. The remaining f — m rows are referred to as the un-actuated part 
associated with the / —m un-actuated coordinates q^,. In the following, it is assumed 
that Ba = /is the identity matrix and B„ = 0. These special choices represent 
interesting cases of underactuated multibody systems in tree structure. Examples 
include rigid multibody systems with passive joints and planar elastic manipulators, 
where the shape functions of the elastic bodies are chosen according to clamped 
boundary conditions, see e.g. [5]. 

The nonlinear input-output normal-form depends on the choice of the system 
output y . Here it is assumed that the end-effector position can be approximately 
described by an output of form 

y = qa + rq,„ (3) 

where F e M'"^^" . This output is a linear combination of actuated and un-actuated 
generalized coordinates. For example such an output can be used to describe the 
end-effector position of elastic manipulators as shown in [5]. The partitioned equa- 
tion of motion (2) is now transformed into the input-output normal-form with new 
coordinates y.q^,, see e.g. [17] for details. The input-output normal-form of the 
underactuated multibody system with the system output given by (3) reads 

My =g-k+u, (4) 

[M..,-Mlr]q,, = g_^-k,.-MlM-\g-l^+u]. (5) 
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luAu 



internal dynamics 
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Fig. 1 Graphical representation of the input-output normal-form of underactuated MBS 

In this nonlinear input-output normal-form, the terms are summarized according to 
the following convention: 

M = Maa- (Ma. - Maar){M„, - MI^P)'' M^^, 
g=ga- (Mau - Maar){M,, - Ml,r)~'g^, 
lc=ka- (Mau - Maar){M..„ - Ml,r)~'k.,. 

Equation (4) has dimension m and describes the relationship between the input u 
and output y. Since this is a second order differential equation for the output y 
the considered multibody systems have vector relative degree r = {ri, ... , r^} = 
{2, ... , 2}. The second part of the normal-form, given by (5) has dimension f — m 
and describes the so-called internal dynamics. A graphical representation of the 
nonlinear input-output normal-form of the underactuated multibody system with 
the system output given by (3) is shown in Fig. 1. The typical state-space rep- 
resentation of the input-output normal-form is omitted here, since analysis of 
the zero-dynamics, feedback linearization and feed-forward control design can be 
directly performed from the second order differential equations (4) and (5). 



2.2 Analysis of the Internal Dynamics 

The analysis of the stability of the internal dynamics is crucial for control de- 
sign. Since this analysis is often quite complex the concept of zero-dynamics is 
used in drawing important conclusions about the stability of the internal dynamics. 
The zero-dynamics is the internal dynamics under the constraint that the output is 
kept identically zero, i.e. j = 0, V f . For the considered underactuated multibody 
systems the required control input for this task follows from (4) of the nonlinear 
input-output normal-form as 



Uo=k{(i,q„(i,qJ-g{0,q„0,qJ. 



(6) 
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Applying this input mq to (5), the internal dynamics reduces to the zero-dynamics 
of the underactuated multibody system and reads 

[M,„,(0, q,) - Mr„(0, q,)r]q, = g„(0, q„ 0, ?„) - *»(0, 9„, 0, ^„). (7) 

As shown in [8, 14] a nonlinear system is called asymptotically (exponentially) 
minimum phase if the equilibrium point of the zero-dynamics is asymptotically (ex- 
ponentially) stable. Otherwise, the system is called non-minimum phase. It should 
be noted, that the minimum phase property is independent of the choice of co- 
ordinates, and thus is invariant under a diffeomorphic coordinate transformation 
z = ^(x). However, the minimum phase property depends on the system dynamics 
given by the equation of motion (1) and the choice of the system output y. 



2.3 Feedback Linearization 

Feedback linearization is based on the presentation of the nonlinear system in input- 
output normal-form. Then, the nonlinearities are cancelled using state-feedback, 
resulting in an exactly linearized system or subsystem, see [4, 8, 14, 18]. This 
approach is fundamentally different from Jacobian-linearization, in which the non- 
linear system is approximated by a linear system. The nonlinearities in (4) can be 
cancelled by the linearizing feedback control law 

u = Miy,qJv + k(y,q^,,y,q,,)-J{y,q^„y,qJ, (8) 

where visa new input. It should be noted, that this linearizing feedback law depends 
on all states of the transformed system, i.e. y, y.q^,, q^,. Applying the linearizing 
feedback law (8) to the input-output normal-form (4) and (5) yields the input-output 
linccirized system 

y=v, (9) 

[M„„ - Mlr]q, = g„ - k, - Mlv. (10) 

The system consists of two subsystems. The first subsystem describes the linear 
relationship between the new input v and the output y and consists of m chains of 
two integrators. Therefore, this approach is also called input-output linearization. 
The second subsystem, resulting from the internal dynamics, is in general nonlinear. 
From (9) and (10) it is seen that only the first subsystem influences the output. Thus 
the feedback law (8) renders the states q^, ^„ of the internal dynamics unobservable. 
Since the first subsystem (9) is in canonical controllable form one can use linear 
control methods, such as eigenvalue assignment, to design with the new input v a 
feedback controller which influences the output j in a desired way. In this paper 
the control goal is trajectory tracking of a desired system output J = Jj, which 
is given by (3). Following [8, 14] the new input v in (9) might be used in order 
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to achieve asymptotic output trajectory tracking. The investigated underactuated 
multibody systems have vector relative degree r = {ri, . . . ,rm} = {2, . . . , 2} 
and therefore the tracking control law reads 

v = yd + Pi(yd-y) + Poiyd-y)- (H) 

Thereby, the coefficients Po. Pi are diagonal matrices. In the special case of a con- 
stant reference trajectory, j = 0, Vf, the tracking control law (11) reduces to a 
control law for stabilization around a stationary point. Introducing the output trajec- 
tory error e = y j—y and applying control law (1 1) to the linearized subsystem (9) 
yields the linear error dynamics 

e + p^e + p^e ={S. (12) 

From this follows that the diagonal matrices po- P\ can be used to place the eigen- 
values of the error dynamics in the in the left half-plane. Then, due to a suitable 
choice of Pq, p^ the system output converges to the desired reference trajectory. 
However, the task of controller design is not only to influence the output in a de- 
sired way, but also to achieve that the whole dynamics of the system behaves well. 
Thus, the control design given by control law (1 1) is only valid, if the unobservable 
states 5„, ^„ of the internal dynamics remain bounded. 

An inspection of the internal dynamics (10) under the control law (11) shows, 
that it can be viewed as a nonlinear time-varying system driven by the desired out- 
put trajectory j j. First of all, the special case of tracking a constant output j = 
has to be considered. In this case the tracking control law (11) coincides with a sta- 
bilizing control law and leads to the requirement that the zero-dynamics has to be 
asymptotically stable, i.e. the nonlinear system is minimum phase, see [8, 14, 18]. 
In the case of trajectory tracking, this is initially a rather weak condition, however 
from a practical point of view it is the crucial point for the analysis of the behavior 
of the internal dynamics. For a non-constant desired output trajectory jj additional 
conditions exist which strengthen the requirement of minimum phase, see [8, 14, 1 8] 
for details. For example in [18] it is shown that for exponentially minimum phase 
nonlinear systems the desired trajectory jj and its first r,; — 1 derivatives must be 
small enough in order to guarantee that control law (11) yields to convergence of 
the tracking error e and bounded internal states q^, q^. 

The presented control structure consists of an inner and an outer loop, and is 
shown schematically in Fig. 2. In the inner loop exact linearization is achieved by 
using state-feedback law (8). The outer loop is used for eigenvalue assignment of 
the error dynamics (12) by control law (11). 



2.4 Feed-forward Control Design 

The feed-forward control design is based on an inverse model which provides the 
input Ud required for exact reproduction of a desired output trajectory y = y d- 
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Fig. 3 Graphical representation of feed-forward control of an underactuated MBS 



In order to account for small disturbances and uncertainties the feed-forward con- 
trol has to be supplemented by additional feedback control. This yields a so-called 
control structure with two design degrees of freedom. Both parts of this control sys- 
tem can be designed largely independent from each other The input hj computed 
by the feed-forward control follows from (4) as 



Ud =M(y^,qJyj-qiy^,q,„y^,qJ + k{yj,q^,,y^,qJ. 



(13) 



The computation of the input hj depends on the desired output y d^y d ^^'^ ^^ un- 
actuated states ^„, ^„. These latter ones are the solution of the internal dynamics (5) 
which is driven by Jj, jj and mj. Replacing hj in the internal dynamics (5) by 
(13) yields for the un-actuated states q^^, q^^ the differential equation 

[M„„(jrf,9„)-Mj„(jd,?„)r]^„ 
= 8uiyd^9u^yd'ku)-kuiyd'9u-yd'ku)-Ml,Xyd^qu)yd- (i4) 

In summary, the inverse model consists of three parts which are shown schemat- 
ically in Fig. 3. The first part represents a chain of two differentiators for the desired 
output vector j^, producing the values j ^ and j j. The second part of the inverse 
model is the driven internal dynamics (14) for the ^„ coordinates. The third part of 
the inverse model is the algebraic equation (13) which computes from these values 
the desired input « j . 
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Several methods for model inversion exist which differ in the solution of the 
internal dynamics (14). In classical model inversion [7] the ^„,^„ variables are 
found through forward integration of the internal dynamic (14) from the start- 
ing time point to to the final time point tf, using the initial values q„(to) = 
luQ' 4u(h) = 9«o- However, in order to use the input mj in a feed-forward con- 
trol, it must be bounded. Thus depending on the stability of internal dynamics 
forward integration of the internal dynamics might yield unbounded 9„,^„ val- 
ues and thus unbounded inputs « j . Therefore, classical inversion can only be used 
for feed-forward control design if the internal dynamics (14) remains bounded, 
which implies that only minimum phase systems can be treated. In the case of non- 
minimum phase systems a bounded feed-forward control can be computed by stable 
inversion as described in [6, 17, 21]. However, in this approach the internal dynam- 
ics (14) is solved as a two-sided boundary value problem. This yields a non-causal 
solution, i.e. pre- and post-actuation is necessary. In general the solution of this 
boundary value problem must be pre-computed for each desired output trajectory, 
e.g. by using finite differences as proposed in [21]. 

An alternative to the presented approach for feed-forward control design is pro- 
posed in [3] for flat mechanical systems. Thereby the inversion problem is solved by 
deriving a set of differential-algebraic equations which is then solved numerically. 
A similar approach is used in [12] for minimum phase flexible manipulators. 



3 Design of Stable Zero-Dynamics 

The analysis of the initial design of an underactuated multibody system might show 
that it possess an unstable zero-dynamics. Due to the previously discussed short- 
comings and difficulties in trajectory control of non-minimum phase systems, it is 
desired to design the multibody system in such a way that the zero-dynamics is 
stable. Then feedback linearizable is possible and also the design of feed-forward 
control is significantly simplified. As shown in Sect. 2.2, the zero-dynamics depends 
on the choice of the system output y and the equation of motion of the multibody 
system. Output relocation is a method where a different system output y is chosen in 
order to achieve minimum phase property. However, the use of this approach is lim- 
ited if trajectory tracking of an end-effector point is aspired. Thus, minimum phase 
property can only be achieved by modifying the system dynamics, which means the 
mechanical design of the underactuated multibody system must be altered. 



3.1 Identification of Possible Design Parameters 

In the following physical parameters which influence the stability properties of the 
zero-dynamics are identified. For this exemplary investigation, a single rotational 
arm is considered which consists out of two links connected by one active and one 
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Fig. 4 Rotational arm with 

one active and one passive 
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passive joint. The rotational arm is shown in Fig. 4, whereby the center of mass of 
link ; = 1,2 is denoted by C/. The links have length /,, mass m,, inertia /,; and 
the position of the center of mass is described by s, . An input torque Ti acts on 
link 1. Link 2 is connected by a passive joint to link 1, which is supported by a 
spring-damper combination with spring constant c and damping coefficient d . The 
arm is described by the generalized coordinates a and fi, whereby P denotes the 
un-actuated coordinate. The arms moves perpendicular to the direction of gravity. 

For this investigation the system output of the rotational arm is given by the linear 
combination y = a + Ffi. In this case the zero-dynamics of the rotational arm reads 



•2 



[{l-r){l2,, + m2sl)-hm2S2rcosp]p = -c^-dp-hm2S2r^P sin^S. (15) 



This shows that the zero-dynamics of the rotational arm is influenced by the length /i 
of the first link, the value F of the output, the coefficients c, d of the spring-damper 
combination and the mass distribution of the second link, given by its mass m2, 
inertia l2z and center of mass S2- For a further analysis the linearized zero-dynamics 
around the equilibrium point ;S = is considered, 

[(1 - nih, + m2sl) - hm2S2r]p +dp+cP= flzl + a J + aoP = 0, (16) 



where a2,ai,ao correspond to the coefficients of the characteristic polynomial. 
Thus, the linearized zero-dynamics of the rotational arm is only asymptotically sta- 
ble if all coefficients 02, ai . ao have the same sign and are non-zero, see e.g. [13]. 
Since the constants c,d of the spring-damper combination are by nature positive 
also the coefficient 122 has to be positive. Thus, in this case c, d can only be used to 
shape the dynamic response of the zero-dynamics, but cannot be used to change its 
stability property. The factor F is assumed to be fixed in order to give a suitable ap- 
proximation of the end-effector position. Also in the following it is assumed that the 
arm topology is fixed, i.e. the length of the links cannot be changed. Consequently 
only the mass distribution of the second link remains as design variable to alter 
the stability properties of the zero-dynamics of the rotational arm. This analysis is 
representative for the systems considered in the following. 

The mass distribution of the un-actuated link, given by its mass m2, inertia I2Z 
and center of mass S2, could be used directly as design variables. However, these 
quantities are coupled, and the optimization might yield values which cannot be real- 
ized from an engineering point of view. Therefore, in the following approach a basic 
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Fig. 5 Four possible design variants for altering the mass distribution of a link 



homogeneous initial design of the underactuated multibody system is assumed. 
Then, in the case of a non-minimum phase initial design an additional small bal- 
ancing mass rria is added to each un-actuated link in order to alter the stability 
property of the zero-dynamics, thus achieving minimum phase behavior. The ad- 
ditional mass rua is added to the location Sa on the link which changes the center 
of mass of the combined body. To gain a higher flexibility the additional mass iria 
might be added as counterweight to the link. An additional increase of the inertia 
I27 of the un-actuated link can be achieved if the mass rria is located by an offset da 
away from the axis of the link. Four possible design variants for an un-actuated link 
are shown schematically in Fig. 5. 

In summary, the three design variables p = [ma,Sa, da] can be used to alter the 
mass distribution of an un-actuated link and yield 

- , '«2^2 + maSa 

m2 = m2 + ma, S2 = , 

'"2 (17) 

hz = hz + m2{S2 - S2)^ + ma(Sa - S2)^ + niada ■ 

Thereby /«2, Iiz^ ^2 denote the values of the initial design of the un-actuated link. 
In order to obtain a viable physical design, bounds have to be put on the design 
variables which results in the feasible design space for one un-actuated link 

P={p€lR^\0< ma < ma,„„,-, Sa„,„ < Sa < ^«,„„,, < da < 4,„„J. (18) 

For underactuated multibody systems with several passive joints, the design vari- 
ables p and the feasible design space P are the collection of the design variables of 
all un-actuated links. 



3.2 Optimization Criteria 

For this design process an optimization procedure is proposed using the previously 
identified design parameters to modify the mass distribution of the un-actuated 
bodies. The primary design goal is to achieve a stable zero-dynamics, such that 
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the underactuated multibody system is feedback linearizable. The zero-dynamics 
is given by (7) and depends only on the un-actuated states q^^.q,, and the design 
variables p. Therefor, the zero-dynamics can be written as 

[M„„(/', qj - Ml^,(p, 9„)r]^„ = gSp, ?„, ^„) - kuip, 9„, 9„), (19) 

and the system matrix of the linearized zero-dynamics (19) is denoted by A (p). 

In order to obtain a powerful mechanical design, not only minimum phase be- 
havior must be guaranteed, but also additional goals must be achieved by the design 
process. Firstly, the design should be robust to uncertainties in the mass distribution 
of the un-actuated bodies. This mean, the system should remain its minimum phase 
property even if in the physical construction there are small unknown variations. 
These can be either in the initial mass distribution or in the optimal design vari- 
ables. Secondly, the zero-dynamics should not only be stable, but also disturbances 
should decay rapidly. This is especially important in order to avoid that disturbances 
yield large undesired vibrations of the internal dynamics during trajectory tracking. 
Therefore, a two-stage computation of the optimization criteria f{p) is proposed, 
which should be minimized in the course of the optimization; 

1. In the first step, Lyapunov's indirect method is used, see [1 1]. It requires that all 
eigenvalues of the linearized zero-dynamics are in the left half-plane, 

Re[X(A(p))]<0. (20) 

In order to achieve robustness against uncertainty in the mass distribution of the 
un-actuated bodies it is also desired that the eigenvalues for several perturbation 
parameter sets p + Ap are in the left half-plane, 

Re[X{A(p + Ap))] < 0. (21) 

In this paper 16 designs with perturbation are tested, whereby the design parame- 
ters are varied in different combination by up to 5% around the nominal values p. 
Thus a safety region is created around the nominal design p. If at least one eigen- 
value of the nominal or the designs with perturbation has a non-negative real part, 
a large default value for the optimization criteria f(p) is returned. Otherwise, 
the linearized analysis shows asymptotic stability and it is proceeded with step 2. 
It should be noted, that her only a point-wise robustness test is performed. In 
order to guarantee robustness over the entire region of uncertainties this point- 
wise test can be replaced by a /^-analysis, see e.g. [19] for details on analysis of 
system robustness. 

2. If all eigenvalues are in the left half-plane, the final optimization criteria f{p) 
is calculated. In order to achieve good damping properties it is required that ini- 
tial errors in the nonlinear zero-dynamics (19) decay rapidly. The disturbance is 
given by the initial conditions ^„(?o) = ?«o>?j,(^o) = quo- The optimization 



272 R. Seifried 

criteria /(p) is then described by the cumulated squared error of the f — m 
un-actuated coordinates 9,, in respect to the equilibrium point ^„ = of the 
zero-dynamics. This is given by, 

f-m 'j. 

fip) = E / 4*' (22) 

where f 1 describes the final time of the simulation. Besides evaluating the damp- 
ing properties of the zero-dynamics, this second step in the criteria computation 
provides also a very good indication about the behavior of the nonlinear zero- 
dynamics. It gives an indication if the zero-dynamics remains stable in the case 
that the internal states are pushed by a disturbance further away from the equi- 
librium point. In order to achieve a good trade-off between the damping property 
and avoiding large mass increases the optimization criteria (22) can be extended 
in the sense of weighted criteria to 

f-m 'j. f-m 

fiP)= E hldt + wj^f^a,^ (23) 



'=1 ;;, '=1 



where w is a weighting factor. 



3.3 Particle Swarm Optimization 

In the optimization procedure the criteria function f(p) should be minimized with 
respect to the design variables p. Due to the two-stage criteria computation, the op- 
timization problem is discontinuous. Also an analysis of this optimization problem 
shows, that there are many local minima, often surrounded by areas of instability 
of the zero-dynamics. The complexity of the topology of the optimization criteria 
increases with the number of passive joints. Therefore, gradient based optimization 
algorithms cannot be used and stochastic optimization algorithms must be deployed. 
Here a particle swarm optimization procedure is used. This is a population based op- 
timization method which originates in the study and simulation of social behavior 
of bird and fish flocks, see [9]. The basic idea is the modeling of social interaction 
between individual particles of a population on the quest for the best point in the fea- 
sible design space. Thereby, it is aspired to use the collective intelligence of a swarm 
to solve complex optimization problems. A detailed analysis of swarm intelligence 
is given in [10]. 

As advantages of the particle swarm optimization it should be named that no 
gradient information is necessary, the solution is independent of initial sets of design 
parameters p, and there are no requirements on smoothness or continuity of the 
optimization criteria. This approach is well suited for finding global minima and 
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is often easy to program and to adjust to specific problems. The used algorithm is 
a Matlab implementation presented in [15, 16], and has been already successfully 
applied in the optimization of multibody systems. 

Compared to gradient based methods, a general disadvantage of stochastic opti- 
mization algorithms is their large computational expense due to a large amount of 
criteria evaluations. In the criteria computation the far most time-consuming part is 
the time-integration of the zero-dynamics in the second stage. However, in the first 
stage of the criteria computation many unfeasible designs are filtered out and thus 
the number of time integrations is heavily reduced by the restriction on locally sta- 
ble designs. In summary, the two-stage type of criteria calculation in combination 
with particle swarm optimization is an efficient way to design a stable and robust 
zero-dynamics which also shows good damping properties. 



4 Application Examples 

In this section the efficiency of the optimization-based design approach for un- 
deractuated multibody systems is presented using simulations of two application 
examples. These two examples are planar underactuated manipulators with a kine- 
matic redundancy and one and two passive joints, respectively. In the first example 
the advantages of the presented optimization-based design and control approach is 
compared to other possible control strategies and its robustness is demonstrated. 
The second example shows different designs which can be achieved in an efficient 
way by this optimization-based design approach. 



4.1 Manipulator with One Passive Joint 

The first investigated underactuated manipulator has one passive joint and is 
shown schematically in Fig. 6. The manipulator moves along the horizontal plane 
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Fig. 6 Underactuated manipulator with one passive joint 
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Table 1 Initial parameters for underactuated manipulator with one 
passive joint 



Cart 


m, = 3kg 








Arm 1 


m, = 6.S15kg 


/i 


= 0.5743^ 


h = 1.0m 


Arm 2 


m, = 3.438A:g 


h 


= 0.0723^ 


h = 0.5m 


Arm 3 


mi = 3A3Skg 


h 


= 0.0723^ 


h = 0.5m 




^ = 50^ 


d 


= 0.25^ 

rad 





and consists of a cart on which a chain of three arms is mounted. The homogenous 
arms have length /i and I2 = h- The manipulator is described by the generalized co- 
ordinate q = {x,ai.a2- P)^ and is actuated by the control input u = (F. Ti , 72)^. 
The third arm is connected by a passive joint to arm 2 which is supported by a par- 
allel spring-damper combination with spring constant c and damping coefficient d. 
The initial physical parameters of the manipulator are summarized in Table 1 . 

The control goal is to force the end effector to follow a predefined trajectory as 
closely as possible. For a somewhat stiff spring-damper combination, the angle P 
remains small. Then, the end-effector position can be approximated by 



rEF = 



X + li sin(Q;i) -|- I2 sin(Q!l -|- al) + It, sin(Q!i + aj + P) 
—l\ cos(ai) — I2 cos(q!1 -h a2) — I3 cos(q!i + aj + P) 

X + h sin(o;i) + {h + h) sin(Q!i +a2 + PP) 

-h COS(Q!i) - (I2 + h) COS(Q!i -I- Q!2 -I- P P) 



which can be described by the linearly combined output 



y = {yi,y2,y-i) = {x,ai,a2 + rp) . 

From geometrical consideration it turns out that the factor F 



(24) 
(25) 

(26) 



h 
h+h 



0.5 pro- 
vides a good approximation of the position of the end-effector point as long as P 
remains small. This choice of output is motivated from control of elastic manipu- 
lators, see [5]. However, it should be noted, that due to this approximation a small 
tracking error for the end-effector position has to be expected. 

The analysis of the zero-dynamics of this manipulator with output y shows that 
it is identical to the zero-dynamics of the rotational arm presented in Sect. 3.1 
and is given by (15). With the physical parameters given in Table 1 the system 
is non-minimum phase. In order to achieve feedback linearizability the presented 
optimization procedure is used. As discussed in Sect. 3.1 the design parameters 
are an additional mass nia and its location Sa and offset da which is added to 
the un-actuated arm 3. The bounds of the optimization parameters are summa- 
rized in Table 2 and are chosen in such a way that a viable mechanical design 
can be achieved. The optimization criteria given by (23) with ?i = 0.25 i' and 
w = 0.75 10"'* is chosen. The result of this optimization is also given in Table 2. 
The optimization result shows, that the required additional mass is only 0.531 kg. 
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Table 2 Optimization Parameters m„ (kg) s„ (m) d„ (m) 

parameters-bounds and ; ; ; — TTH — 7. 

^ ,. . ^. „ Lower bound -0.333 
optimization result 

Upper bound 1 0.333 0.167 

Opt. result 0.531 -0.333 0.167 



which is a small increase of 3.1% compared to the total mass of the initial design. 
The additional mass is mounted with an offset as counterweight to the un-actuated 
link, corresponding to design variant d shown in Fig. 5. The computed values for 
distance Sa and offset da are on the bounds of the design space. Thus, a further 
improvement might be achieved be increasing these bounds. 



4.1.1 System Without Disturbances and Uncertainties 

The obtained optimal design is tested considering a half-circular end-effector tra- 
jectory. The center of the half-circle is at position (0, — 1 .5 m) and the radius is \m. 
The end-effector point should follow the trajectory in the short time period of 1.5 s, 
which describes an aggressive manoeuver. Also the kinematic redundancy should 
be used to perform a secondary task, which is moving the cart from starting position 
— 1 m to the final position 1 m. 

The simulation results of the end-effector trajectory for the manipulator with 
optimized mass distribution in combination with feedback linearization is presented 
in Fig. 7. These results are compared to those using two alternative control concepts 
applied to the initial design of the manipulator. The first alternative control approach 
is a stable inversion based feed-forward control of the non-minimum phase initial 
system combined with a time-variant Linear Quadratic Regulator (LQR). The stable 
inversion yields a pre-and post-actuation phase and its computation is described 
for this manipulator in [17]. The second alternative control approach is a feedback 
linearization, whereby minimum phase property of the initial system is achieved 
by output relocation. An analysis of the zero-dynamics shows that in this example 
output relocation requires F < 0.4. In the presented example F = 0.4 is chosen. 

The simulations show, that the best results are achieved by the design with opti- 
mized mass distribution of the un-actuated arm. Hereby the errors of the end-effector 
point trajectory are the smallest of the three approaches. In this case the maximum 
error of the end-effector trajectory is 2.1 mm. This is better than the results achieved 
with the stable inversion approach, which yields a maximum error of about 3.2 mm. 
The output relocation approach yields the worst performance with a maximum end- 
effector trajectory error of 28 mm. Since in all three cases there are no disturbances 
the output y is tracked exactly. The end-effector point errors occur due to the ap- 
proximation of the end-effector point by (25). It turns out, that due to the changed 
mass distribution of the un-actuated arm in the optimized design, the fi coordi- 
nate remains smaller than in the case of the stable inversion of the initial system. 
Thus the approximation given by (25) is even better, and yields as a side effect a 
smaller trajectory error for the optimized design. The output relocation yields an 
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Fig. 7 End-effector point trajectory for manipulator with one passive joint 



unsatisfying performance since with F = 0.4 only an insufficient approximation of 
the end-effector point can be achieved. This approach might be useful in the case 
of stabilization but not in the case of trajectory tracking and is therefore not further 
considered. 

The control inputs for the feedback linearization of the optimized manipulator 
and for the stable inversion of the initial design are presented in Fig. 8. A small mass 
increase occurs due to the additional mass in the optimization. Thus, the energy con- 
sumption when tracking this trajectory increases by approximately 5% compared to 
the stable inversion of the initial system. However, feedback linearization has several 
significant advantages compared to stable inversion. Feedback linearization yields 
an algebraic control law which is relatively easy to implement and independent of 
the desired output trajectory. In contrast, stable inversion of a non-minimum phase 
system has to be computed off-line for each desired trajectory separately by the nu- 
merical solution of a two-sided boundary value problem, see e.g. [6, 17,21]. Also the 
time-variant LQR requires the numerical solution of the differential Riccati equa- 
tion which also has to be performed off-line for each output trajectory. Finally, it has 
to be noted, that stable inversion of a non-minimum phase system yields a so-called 
pre-actuation phase. In the case of a kinematic redundancy, this pre-actuation phase 
might be avoidable as proposed in [17], however this goes along with an increase of 
the consumed energy. 
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Fig. 8 Control inputs for stable inversion + LQR (left) and feedback linearization of optimized 
system (right) 



4.1.2 System Under Disturbances and Uncertainties 

The results presented in the previous section where obtained by simulation of an 
ideal system. However, in reality there are always disturbances and uncertainties 
which might deteriorate the performance of the control strategy. Therefore, simu- 
lations are performed where the total mass, inertia and the center of mass of the 
un-actuated arm is increased by 5%, the stiffness of the spring is increased by 15% 
and the damping is set to J =0. Also measurement noise is considered, which is 
added as white noise to the generalized coordinates. The simulation results for the 
end-effector trajectory using the feedback linearization of the optimized minimum 
phase system and for stable inversion with LQR of the initial non-minimum phase 
design are shown in Fig. 9. The corresponding control inputs are shown in Fig. 10. 

The results show, that both strategies still work very well under these uncer- 
tainties and disturbances. However, the maximum trajectory tracking error using 
the feedback linearization is with 3.9 mm about 42% smaller than in the case of 
stable inversion which is 6.7 mm. The weighting matrices of the LQR are chosen 
in such a way that under these specific parameter uncertainties a good perfor- 
mance is achieved and the noise in the control inputs is in the same magnitude 
as in case of feedback linearization, see Fig. 10. This shows, that the proposed 
optimization-based design procedure yields a robust design in which the internal 
dynamics remains stable under these uncertainties and disturbances, and thus feed- 
back linearization yields satisfying results. 



4.2 Manipulator with Two Passive Joints 



The second investigated underactuated manipulator is shown schematically in 
Fig. 11. It is similar to the first example, however it has two passive joints and 
an additional load is added to the end-effector point. The actuated generalized 
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Fig. 9 End-effector point trajectory under disturbances for manipulator with one passive joint 
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Fig. 10 Control inputs for stable inversion + LQR (left) and feedback linearization of optimized 
system (right) under disturbances 
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Fig. 11 Underactuated manipulator with two passive joints and end-point load 
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Table 3 Initial parameters for underactuated manipulator with two 
passive joints 



Cart 


m, = 3 kg 








Arm 1 


m, = 6.875 /tg 


/, =0.5743^ 


h 


= 1.0 m 


Arm 2 


1712 = 2.292 kg 


h = 0.0217^ 


h 


= 0.333 m 


Arm 3 


mj = 2.292 kg 


/j = 0.0217^ 


h 


= 0.333 m 


Arm 4 


m^ = 2.292 kg 


74 = 0.0217^ 


U 


= 0.333 m 


Load 


mi = 6 kg 

c = 4oo;^ 


/, =0.0147^ 
^=0.25^ 







coordinates are ^^ = (x, ai , 0(2)^ and the un-actuated generalized coordinates are 
9u — iPi' Pi)^ ■ The physical properties are summarized in Table 3. In order to 
approximate the end-effector point a linearly combined system output is chosen as 



"0 


01 








La 


aJ 



y =qa+ rq, with r = . (27) 



In this example a good approximation is achieved for Fi = 2/3 and Fj = 1/3. 
However, this yields a non-minimum phase initial design, and the proposed opti- 
mization procedure is used to design a minimum phase manipulator. 

An analysis of the zero-dynamics of this manipulator shows, that the optimiza- 
tion problem is more complex than in the first example. Many local minima with 
similar criteria value exit and the influence of the mass distribution of the two un- 
actuated arms on the zero-dynamics is coupled. From a practical point of view the 
primary goal is to find in an efficient way a viable design. This does not have to 
be necessarily the global optima, but a local optima with very good performance is 
sufficient. Therefore three cases are considered using different bounds and optimiza- 
tion criteria. For each case 1 optimization runs are performed which are terminated 
after 120.?. 

Table 4 shows for Case 1 the bounds on the parameters and the results using the 
optimization criteria given by (22). Case 2 uses the optimization criteria given by 
(23) and the weighting factor w = 1.5 x 10""*. The bounds and results for Case 2 
are summarized in Table 5. Table 6 presents the results of Case 3 where the cri- 
teria given by (22) is used, however no offset da is allowed. For each parameter 
set the value of the optimization criteria / is given in the tables. Also the maxi- 
mum trajectory error emax and the increase AE of the energy consumption is given 
in comparison to the solution of the stable inversion approach of the initial non- 
minimum phase design for the previously presented half-circular trajectory. Thereby 
the trajectory error emax and the increase of consumed energy AE are computed 
from simulations of the optimized designs. 
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Table 4 Case 1 - bounds and optimization results 





'«o, 


^fl, 


da I 


r"r,2 


^fl. 


dai 


/ 


^inax 


AE 




(kg) 


(ra) 


(ra) 


(kg) 


(ra) 


(ra) 


(10-*) 


(mra) 


(%) 


Lo. bound 





-0.333 








-0.333 











Up. bound 


1 


0.333 


0.167 


2 


0.333 


0.167 








Run 1 


0.627 


-0.128 


0.130 


2.000 


-0.333 


0.167 


1.31 


2.388 


11.4 


Run 2 


0.997 


-0.094 


0.086 


1.999 


-0.333 


0.167 


1.28 


2.375 


13.1 


Run 3 


0.665 


-0.148 


0.000 


1.999 


-0.333 


0.167 


1.27 


2.370 


11.6 


Run 4 


0.707 


-0.139 


0.042 


1.989 


-0.333 


0.080 


1.68 


2.465 


12.6 


Run 5 


0.535 


-0.175 


0.000 


1.999 


-0.333 


0.167 


1.28 


2.376 


11.0 


Run 6 


0.200 


-0.325 


0.167 


2.000 


-0.333 


0.167 


1.35 


2.406 


9.5 


Run? 


0.905 


-0.111 


0.055 


1.994 


-0.333 


0.167 


1.27 


2.371 


12.6 


Run 8 


0.625 


-0.148 


0.066 


1.999 


-0.333 


0.167 


1.28 


2.375 


11.3 


Run 9 


0.731 


-0.130 


0.069 


1.999 


-0.333 


0.167 


1.28 


2.374 


11.9 


Run 10 


0.900 


-0.088 


0.125 


1.986 


-0.333 


0.005 


1.91 


2.364 


13.3 



Table 5 Case 2 - bounds and optimization results 





'"a, 


.5„, 


4, 


'"fl2 


Sa, 


da2 


,/ 


^fnax 


AE 




(kg) 


(ra) 


(ra) 


(kg) 


(ra) 


(m) 


(10-*) 


(ram) 


(%) 


Lo. bound 





-0.333 








-0.333 











Up. bound 


1 


0.333 


0.167 


2 


0.333 


0.167 








Run 1 


0.408 


-0.195 


0.119 


1.573 


-0.333 


0.167 


4.67 


2.410 


9.1 


Run 2 


0.203 


-0.333 


0.118 


1.635 


-0.333 


0.166 


4.41 


2.416 


8.5 


Run 3 


0.218 


-0.327 


0.066 


1.610 


-0.333 


0.167 


4.42 


2.410 


8.4 


Run 4 


0.287 


-0.262 


0.117 


1.896 


-0.333 


0.034 


5.34 


2.360 


9.9 


Run 5 


0.228 


-0.319 


0.067 


1.665 


-0.333 


0.167 


4.44 


2.445 


8.8 


Run 6 


0.342 


-0.210 


0.159 


1.576 


-0.333 


0.167 


4.59 


2.415 


8.8 


Run 7 


0.364 


-0.207 


0.137 


1.578 


-0.333 


0.167 


4.61 


2.411 


8.9 


Run 8 


0.218 


-0.327 


0.061 


1.606 


-0.333 


0.167 


4.40 


2.409 


8.4 


Run 9 


0.210 


-0.333 


0.080 


1.787 


-0.333 


0.162 


4.53 


2.547 


9.5 


Run 10 


0.241 


-0.294 


0.120 


1.643 


-0.333 


0.166 


4.45 


2.420 


8.7 



The results of Case 1-3 given in Tables 4-6 show, that after optimization runs 
of only 120 5 a viable design for the manipulator with two passive joints is found. 
It must be mentioned, that only about 1 in 10 simulation runs do not yield a non- 
minimum phase design within 120 i', whereby these runs are not accounted for in 
the presented tables. This shows, that the presented approach is a very time-efficient 
method to design minimum phase underactuated systems. For each of the three 
cases, most of the presented optimizations yield similar parameter sets. However, 
there are some variations in the different designs, indicating local minima. 

It is worth to notice, that the additional masses are always placed as counter- 
weights, whereas the mass added to the second un-actuated arm is significantly 
larger than the one added to the first un-actuated arm. The most energy-efficient 
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Table 6 Case 3 - bounds and optimization results 





'«o, 


Sat 


rffli 


m„ 


^fl. 


d.. 


./ 


^ma.\ 


AE 




(kg) 


(m) 


(m) 


(kg) 


(m) 


(m) 


(lo--*) 


(mm) 


(%) 


Lo. bound 





-0.333 








-0.333 











Up. bound 


1 


0.333 





2 


0.333 











Run 1 


0.993 


-0.108 


0.000 


2.000 


-0.333 


0.000 


1.79 


2.386 


13.8 


Run 2 


0.802 


-0.128 


0.000 


1.998 


-0.333 


0.000 


1.81 


2.383 


12.9 


Run 3 


0.999 


-0.107 


0.000 


1.998 


-0.333 


0.000 


1.79 


2.387 


13.9 


Run 4 


0.862 


-0.121 


0.000 


2.000 


-0.333 


0.000 


1.80 


2.385 


13.3 


Run 5 


0.839 


-0.123 


0.000 


1.999 


-0.333 


0.000 


1.80 


2.384 


13.2 


Run 6 


0.727 


-0.138 


0.000 


2.000 


-0.333 


0.000 


1.81 


2.383 


12.6 


Run? 


0.845 


-0.123 


0.000 


1.999 


-0.333 


0.000 


1.81 


2.384 


13.2 


Run 8 


0.814 


-0.126 


0.000 


2.000 


-0.333 


0.000 


1.80 


2.384 


13.1 


Run 9 


0.717 


-0.140 


0.000 


1.998 


-0.333 


0.000 


1.82 


2.382 


12.6 


Runic 


0.505 


-0.182 


0.000 


1.998 


-0.333 


0.000 


1.85 


2.377 


11.6 



designs are obtained by Case 2. This is due to the fact that in the criteria computa- 
tion also the added mass is considered. In this case an average of 8.9% of additional 
energy is required for tracking the half-circular trajectory. Case 1 which does not 
account for the additional mass in the criteria computation yield designs with larger 
mass increase. An average of 11.8% of additional energy is required in order to 
tracking the half-circular trajectory. Evaluating the squared error of the un-actuated 
coordinates as given by (22) shows, that in Case 1 the damping property of the 
zero-dynamics is better than in Case 2. Both, Cases 1 and 2 yield an offset da- 
If no offset is allowed, the added masses have to be larger, as shown by Case 3. This 
yields an increase of the consumed energy by about 13.1% for the half-circular tra- 
jectory. Also the damping property of the zero-dynamics is worse than in Case 1 , as 
indicated by the criteria value /. The damping property is comparable to the ones 
in Case 2. Thus, the use of an offset da improves the performance significantly. 



5 Conclusions 

A design approach for minimum phase underactuated multibody systems is pre- 
sented. This design approach is based on the optimization of the zero-dynamics, and 
yields a stable and robust internal dynamics with good damping properties. Then, 
for trajectory tracking feedback linearization is possible, and also feed-forward con- 
trol design is significantly simplified. As design parameters the mass distributions 
of the un-actuated bodies are identified. The mass distribution can be influenced 
by small masses which are added to the un-actuated bodies. For the optimization a 
particle swarm algorithm is used. Due to the two-stage optimization criteria calcu- 
lation the optimization procedure is very time-efficient and yields reliable results. 
The efficiency of the approach is demonstrated by simulation using two planar un- 
deractuated manipulators with one and two passive joints, respectively. Only a small 
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increase of the total mass is necessary to achieve a feedback linearizable system. In 
the future a more sophisticated design parametrization is aspired to achieve mini- 
mum phase underactuated multibody systems without an increase of the total mass. 
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GPU-Based Parallel Computing 
for the Simulation of Complex Multibody 
Systems with Unilateral and Bilateral 
Constraints: An Overview 

Alessandro Tasora, Dan Negrut, and Mihai Anitescu 



Abstract This work reports on advances in large-scale multibody dynamics 
simulation facilitated by the use of the Graphics Processing Unit (GPU). A de- 
scription of the GPU execution model along with its memory spaces is provided to 
illustrate its potential parallel scientific computing. The equations of motion associ- 
ated with the dynamics of large system of rigid bodies are introduced and a solution 
method is presented. The solution method is designed to map well on the parallel 
hardware, which is demonstrated by an order of magnitude reductions in simulation 
time for large systems that concern the dynamics of granular material. One of the 
salient attributes of the solution method is its linear scaling with the dimension 
of the problem. This is due to efficient algorithms that handle in linear time both 
the collision detection and the solution of the nonlinear complementarity problem 
associated with the proposed approach. The current implementation supports the 
simulation of systems with more than one million bodies on commodity desktops. 
Efforts are under way to extend this number to hundreds of millions of bodies on 
small affordable clusters. 



1 Introduction 

Gauging through simulation the mobility of tracked and/or wheeled vehicles on 
granular terrain (sand and/or gravel) for commercial (construction equipment in- 
dustry), military (off-road mobility), and deep space exploration (Rover mobility 



A. Tasora (Kl) 

University of Parnia, Parma, Italy 

e-mail: tasora@ied.unipr.it 

D. Negrut 

University of Wisconsin, Madison, WI-53706, USA 

e-mail: negrut@wisc.edu 

M. Anitescu 

Ai'gonne National Laboratory, Argonne, IL-60439, USA 

e-mail: anitescu@mcs.anl.gov 



K. Arczewski et al. (eds.), Multibody Dynamics: Computational Methods 283 

and Applications, Computational Methods in Applied Sciences 23, 

DOI 10.1007/978-90-481-9971-6_14, © Springer Science-FBusiness MediaB.V. 2011 



284 A. Tasora et al. 

on Martian terrain) applications leads to very challenging multibody dynamics 
problems. In the past, when applicable, the only feasible approach to these and other 
granular dynamics dominated problems was to approximate the discrete nature of 
the material with a continuum representation. For the classes of problems of interest 
here, such as material mixing, vehicle mobility on sand, piling up of granular bulk 
material, the flow in pebble bed nuclear reactors, rate of flow in silos, stability of 
brick buildings to earthquakes, etc., a continuum representation of the problem is ei- 
ther inadequate or paints with too wide of a brush the dynamics of interest. Tackling 
head on the discrete problem, characterized by a large number of bodies that inter- 
act through frictional contact and might have vastly different mass/inertia attributes, 
has not been feasible in the past. 

The computational multibody dynamics landscape has experienced recently 
changes fueled by both external and internal factors. In terms of the former, se- 
quential computing appears to lose momentum at a time when the microprocessor 
industry ushers in commodity many-core hardware. In terms of internal factors, 
contributions made in understanding and handling frictional contact [1-11], have 
led to robust numerical algorithms that can tackle sizeable granular dynamics prob- 
lems. This paper discusses how the interplay of these two factors will enable in 
the near future a discrete approach to investigating the dynamics of systems with 
hundreds of millions of rigid bodies. 

The paper is organized as follows. Section 2 starts with a brief discussion of three 
roadblocks that adversely impact the potential of sequential computing and limit 
its future role in computational science in general, and computational multibody 
dynamics in particular. An argument is made that in large scale multibody dynam- 
ics emphasis should be placed on implementations that can leverage commodity 
high performance parallel computing. In this context, an overview is presented of 
NVIDIA's hardware architecture, which is adopted herein when tackling large scale 
multibody dynamics problems. The discussion focuses on a description of the par- 
allel execution model, execution scheduling, and memory layout. Section 3 details 
how large scale frictional contact problems associated with granular dynamics are 
solved by a computational approach that maps well onto parallel execution hard- 
ware available on the GPU. The approach implemented has two compute intensive 
parts: the solution of a cone complementarity problem (CCP) and the resolution 
of a collision detection (CD) analysis. In both cases, the solution embraced draws 
on parallel computing and a discussion of the CCP algorithm adopted concludes 
Section 3. Section 4 demonstrates the use of the solution approach implemented. 
First, the paper briefly reports on the largest granular dynamics problems solved us- 
ing the methodology discussed in Section 3. Next, a pebble bed nuclear reactor flow 
problem compares the efficiency of the parallel implementation on the GPU to that 
of the sequential implementation. The paper closes with concluding remarks and a 
discussion of future directions of research. 
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2 Review of Computing on the Graphics Processing Unit 

As pointed out in [12], three road blocks prevent traditional sequential computing 
from experiencing future major gains in flop rate: the memory block, the instruction 
level parallelism block, and the power dissipation block. The first one is a conse- 
quence of the fact that as the data processing power of a CPU core increases, the 
number of memory transactions in the time unit also goes up. From 1986 to 2000, 
CPU speed improved at an annual rate of 55% while memory access speed only 
improved at a 10% rate. One outcome of this trend was an increase in the likelihood 
of cache misses, which have been partially alleviated by employing hyper-threading 
technologies and considering ever increasing cache memories. Nonetheless, cache 
misses occur and they lead to the CPU waiting for chunks of data moved over a 
32.5 GB/s connection that currently connects the CPU to the RAM. The second 
block stems from the exhaustion of the idea of speculative execution of future 
instructions to produce results ahead of time and make them available to the pro- 
cessor in case the actual computational path was correctly anticipated. However, 
this speculative execution strategy necessitates power and is plagued by a combina- 
torial increase in the number of possible computational paths. This translates into 
a short future execution horizon that can be sampled by these techniques. The at- 
tractive attribute of this strategy is that the programmer doesn't have to do anything 
to speed up the code. Instead, the CPU takes upon itself the task of employing this 
strategy. On the flip side, this avenue of speeding up execution has been thoroughly 
taken advantage of and its potential has been already fulfilled. Thirdly, the amount 
of power dissipated by a CPU/unit area has approached that of a nuclear plant [13]. 
Since the power dissipated is proportional to the square of the microprocessor clock 
frequency, it becomes apparent that significant microprocessor frequency increases, 
which were primarily responsible for past reductions in computational times in com- 
modity scientific computing, are a thing of the past. 

One bright spot in this bleak background against which the future of commodity 
hardware for scientific computing is projected comes from the consensus in the 
microprocessor industry that for at least one more decade Moore's law will hold. 
The law states that the number of transistors that can be placed inexpensively on 
an integrated circuit is doubling approximately every 2 years. Since this translates 
into a steady increase in the number of microprocessors that can be packed on the 
unit area, Moore's law indirectly defines the source of future increases in flop rate 
in scientific computing. Specifically, rather than hoping for frequency gains, one 
will have to count on an increase in number of cores as the means for speeding up 
simulation. 

Figure 1 confirms this trend by comparing top flop rates for the CPU and GPU. 
Since the plot compares double precision (DP) CPU flop rates with single precision 
(SP) rates for the GPU, the relevant point is not made by the absolute values. Rather, 
the trends are more important: the slope for the CPU is significantly smaller than 
that of the GPU. Table 1 partially explains the momentum behind parallel comput- 
ing on the GPU. The last generation of NVIDIA cards packs 1 .4 billion transistors, 
reaching 3 billion with the release of Fermi in early 2010, to produce a GPU with 
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Fig. 1 Evolution of flop rate, comparison CPU vs. GPU 



Table 1 CPU vs. GPU comparison. Flop rates reported 
are in single precision (SP) for the GPU and double preci- 
sion (DP) for the CPU 





TeslaClOeO 


Intel 17 975 Extreme 


Cores 


240 




4 


Memory 


4 GB 




32KB LI cache/core 
256 KB L2 cache/core 
8 MB L3 for all cores 


Clock 


1.33 GHz 




3.20 GHz 


Bandwidth 


102GB/S 




32.0 GB/s 


FLOPS 


933 X 10'' 


(SP) 


70 X 10' (DP) 



240 scalar processors, or 512 on Fermi. Their clock frequency is lower, that is, 
1.33GHz, thus partially alleviating the heat dissipation issue. Yet, the GPU compen- 
sates through a larger memory bandwidth (likely to increase to more than 200 GB/s 
on Fermi) and sheer number of scalar processors. 

The idea of using the graphics card for scientific computing dates back more 
than one decade. Their use was motivated by the sheer amount of computational 
power available on the GPU. Fueled by a steady demand for a more realistic video 
game experience, the GPU experienced a continuous increase in flop rate to facili- 
tate the rendering of more realistic visual effects at a rate of 20 frames/s or higher. 
The original graphics pipeline operated through graphics shaders and was meant to 
perform the same set of operations on multiple data sets. The data here is the infor- 
mation associated with a pixel; the operations were the set of instructions necessary 
to determine the state of each pixel of the screen. At high resolutions, this required 
a large number of threads to process in parallel the information that would make 
possible the output of one frame. This computational model, in which one set of 
instructions is applied for many instances of data, is called SIMD (single instruc- 
tion multiple data). It is the paradigm behind processing data on the GPU and was 
leveraged before 2006 by drawing on existing graphics application programming 
interfaces (API) such as OpenGL and DirectX. 
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However, scientific computing through a graphics API was both cumbersome 
and rigid. It was cumbersome since any data processing task had to be cast into a 
shader operation. This either required a lot of imagination, or outright prevented 
one from using GPU computing for more comphcated tasks. The approach was also 
rigid in that it only allowed a limited number of memory transaction operations 
(for instance one thread could only write to one memory location), it lacked certain 
arithmetic operations (such as integer and bit operations), and implementation of 
the IEEE754 standard for arithmetic operations was of secondary importance. 

The GPU computation landscape was revolutionized by the release in 2006 of 
the version 1.0 of the CUDA Software Development Kit (SDK) and library [14], 
which eliminated the vast majority of the barriers that prevented the use of the GPU 
for scientific computing. CUDA allows the user to write "C with extensions" code 
to directly tap into the computational resources of the GPU through a run-time API. 
The CPU, typically called the host, is linked to the GPU, called the device, through 
a Peripheral Component Interconnect Express 2.0 (PCIe 2.0 x 16) connection. This 
connection supports an 8.0GB/s data transfer rate and represents the conduit for 
data exchange between the host and device. 

The hardware layout of the latest generation of NVIDIA graphics cards for scien- 
tific computation called Tesla is schematically shown in Fig. 2. The GPU is regarded 
as one big Stream Processor Array (SPA) that for the Tesla CI 060 hosts a collection 
of 10 Texture Processor Clusters (TPC). Each TPC is made of a texture block (called 
TEX in Fig. 2), and more importantly, of three Stream Multiprocessors (SM). The 
SM, sometimes also called the multiprocessor, is the quantum of scalability for the 
GPU hardware. Thus, entry level graphics cards might have four SMs, such as is 
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Fig. 2 Hardware layout for the Tesla C1060 card. The SPA has ten TPCs, each with three SMs, 
each of which has eight SPs for a total of 240 SPs 
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the case for GPUs like NVIDIA's 9700M GT which are used in computer laptops. 
High end GPUs, such as the NVIDIA GTX 280, have 30 SMs. The Tesla C1060 has 
also 30 SMs since the SPA has ten TPCs, each with three SMs. Finally, each SM 
has eight Scalar Processors (SP). It is these SPs that eventually execute the instruc- 
tions associated with each function that is processed on the GPU. Specifically, the 
device acts as a co-processor for the host, which sends down to the device tasks for 
parallel execution. For this computational model to be effective, at least two require- 
ments must be met. First, the ratio of arithmetic operations to data transfer should 
be high enough to cover the transfer overhead associated with the 8.0GB/s data 
transfer from host to device for processing, and then back to the host for subsequent 
use. Second, the task sent for completion on the GPU, encapsulated in a C function 
called kernel, should have a high level of fine grain SIMD type parallelism. 

For effective use of the available SMs, a kernel function must typically be ex- 
ecuted by a number of threads in excess of 30,000. In fact, the more threads are 
launched, the larger the chance of full utilization of the GPU's resources. It should 
be pointed out that there is no contradiction in 240 SPs being expected to process 
hundreds of thousands or millions of parallel invocations of a kernel function. In 
practice, the largest number of times a kernel can be asked to be executed on Tesla 
C1060 is more than two trillion (65,535 x 65,535 x 512) times. 

When discussing about running kernels on the GPU, it is important to make a 
distinction between being able to execute a kernel function a large number times, 
and having these executions run in parallel. In practice, provisions should be made 
that there are enough instances of the kernel function that are lined up for execu- 
tion so that the 240 SPs never become idle. This explains the speed-ups reported 
in conjunction with GPU computing when applications in image processing, quan- 
tum chemistry, and finance have run up to 150 times faster on the GPU although 
the peak flop rate is less than 10 times higher when compared to the CPU. For the 
latter, cache misses place the CPU in idle mode waiting for the completion of a 
RAM transaction. Conversely, when launching a job on the GPU that calls for a 
very large number of executions of a kernel function, chances are that the scheduler 
will always find warps, or collections of threads, that are ready for execution. In this 
context, the SM scheduler is able to identify and park with almost zero overhead the 
warps that wait for memory transaction completion and quickly feed the SM with 
warps that are ready for execution. The SM scheduler (which manages the "Instruc- 
tion Fetch/Dispatch" block in Fig. 2) can keep tabs on a pool of up to 32 warps of 
threads, where each warp is a collection of 32 threads that are effectively executed 
in parallel. Thus, for each SM, the scheduler jumps around with very little overhead 
in an attempt to find, out of the 32 active warps, the next warp ready for execution. 
This effectively hides memory access latency. 

Note that the number of threads that are executed in parallel (32 of them), is 
typically orders of magnitude smaller than the number of times the kernel function 
will be executed by a user specified number of threads. The latter can be specified 
through a so called execution configuration, which is an argument passed along 
to the GPU with each kernel function call. The execution configuration is defined 
by specifying the number of blocks of threads that the user desires to launch. 
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The maximum number of blocks is 65,535 x 65,535; i.e., one can specify a two 
dimensional grid of blocks. Additionally, one has to indicate the number of threads 
that each block will be made up of. There is an upper limit of threads in a block, 
which currently is set to 512. When invoking an execution configuration, that is 
a grid of m blocks each with n of threads, the kernel function that is invoked to 
be executed on the device will be executed a number of m x n times. In terms of 
scheduling, the m blocks are assigned to the available SMs and therefore a high 
end GPU comes ahead since the m blocks will end up assigned to four SMs on an 
entry level GPU, or to 30 SMs on a high end GPU. The assignment of blocks to 
SMs might lead to the simultaneous execution of more than one block/SM. Yet, this 
number cannot be larger than eight, which is more than sufficient since when they 
land on the same SM the eight blocks of threads are supposed to share resources. 
Indeed, due to the limited number of registers and amount of shared memory avail- 
able on a SM, a sharing of resources between many threads (« x the number of 
blocks executed on the SM) makes very unlikely the scenario of having a large 
number of blocks simultaneously running on one SM. 

In terms of block scheduling, as one block of threads finishes the execution of 
the kernel function on a certain SM, another block of threads waiting to execute is 
assigned to the SM. Consequently, the device should be able to do scheduling at 
two levels. The first is associated with the assignment of a block to an SM that is 
ready to accept a new block for execution. What simplifies the scheduling here is 
the lack of time slicing associated with block execution: if a block is accepted for 
execution on an SM, no other block is accepted by that SM before it finishes the 
execution of a block that it is already dealing with. The second level of scheduling, 
which is more challenging, has to do with the scheduling for execution of one of the 
potentially 32 warps of threads that each SM can handle at any given time. Note that 
all the 32 threads in one warp execute the same instruction, even though this means, 
Uke in the case of if-then-else statements, serializing the code of the if-branches 
and running no-ops for certain threads in the warp (this thread divergence adversely 
impacts overall performance and should be avoided whenever possible). However, 
when switching between different warps, the SM typically executes different in- 
structions when handling different warps; in other words, time slicing is present in 
thread execution. 

In conclusion, one Tesla CI 060, can be delegated with the execution of a kernel 
function up to approximately 2 trillion times. However, at each time, since there 
are 30 SMs available in this card, it will actively execute at most 30,720 = 30 x 32 
warps X 32 threads at any time. Moreover, as shown in Fig. 3, existing motherboards 
can accommodate up to four Tesla CI 060 cards, which effectively supports up to 
122,880 = 4 X 30,720 threads being active at the same time. The single precision 
flop rate of this setup is approximately 3,600 billion operations/s. 

It was alluded before that one of the factors that prevent an SM from actually 
running at full potential; i.e., managing simultaneously 32 warps of threads, is the 
exhaustion of shared memory and/or register resources. Each SM has 16KB of 
shared memory in addition to 16,384 four byte registers. If the execution of the 
kernel function requires a large amount of either shared memory or registers, it is 
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Fig. 3 Image of GPU and desktop with a set of four cards that can be controlled by one CPU. 
There is no direct memory access between the four GPUs. The HW configuration in the figure 
is as follows. Processor: AMD Phenom II X4 940 Black Edition. Power supply 1: Silverstone 
OPIOOO-E lOOOW. Power supply 2: Corsair CMPSU-750TX 750W. Memory: G.SKILL 16GB (4x 
4 GB) 240-Pin DDR2. Case: LIAN LI PC-P80 ATX Full Tower. Motherboard: Foxconn Destroyer 
NVIDIA nForce 780a SLI. HDD: Western Digital Caviar Black 1TB 7200 RPM 3.0Gb/s. HSF: 
Stock AMD. Graphics: 4x NVIDIA Tesla C1060 



clear that the SM does not have enough memory available to host too many threads 
executing the considered kernel. Consequently, the ability of the SM to hide global 
memory access latencies with arithmetic instructions decreases since there are less 
warps that it can switch between. 

In addition to shared memory and registers, as shown in Fig. 4, each thread has 
access to global memory (4 GB of it on a Tesla C1060), constant memory (64 KB), 
and texture memory, the latter in an amount that is somewhat configurable but close 
to the amount of constant memory. Additionally, there is so called local memory 
used to store data that is not lucky enough to occupy a register and ends up in the 
global memory (register overflow). Effectively, local memory is virtual memory that 
is carved out of the global memory and, in spite of the word "local", it is associated 
with high latency. In this context, accessing data in registers has practically no la- 
tency, shared memory transactions have less than four clock cycles of latency, as 
do cached constant and texture memory accesses. Global memory transactions are 
never cached and, just like un-cached constant and texture memory accesses or ac- 
cesses to local memory, they incur latencies of the order of 600 clock cycles. Note 
that typically the device does not have direct access to host memory. There are ways 
to circumvent this by using mapped page-locked memory transactions, but this is an 
advanced feature not discussed here. 

For the GPU to assist through co-processing a job run on the CPU, the host 
must first allocate memory and move data through the PCI connection into the de- 
vice memory (global, texture, or constant memory spaces). Subsequently, a kernel 
function is launched on the GPU to process data that resides in the device mem- 
ory. At that point, blocks of threads executing the kernel function access data 
stored in device memory. In unsophisticated kernels they can immediately pro- 
cess the data; alternatively, in more sophisticated kernel functions, they can use the 
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Fig. 4 GPU memory layout. 
Device memory refers to the 
combination of global, 
texture, and constant memory 
spaces. Arrows indicate the 
way data can move between 
different memory spaces and 
SM. While the device 
memory is available to 
threads running on any SM, 
the registers, shared memory, 
and cached constant and 
texture data is specific 
to each SM 



stream f^ultlnrocessor 
Cone af several) 




Device MermHry 
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shared memory and registers to store the data locally and thus avoid costly device 
memory accesses. If avoiding repeated data transfers between host and device is 
the first most important rule for effective GPU computing, avoiding repeated high- 
latency calls to device memory is the second most important rule to be observed in 
GPU computing. It should be pointed out that device memory access can be made 
even more costly when the access is not structured (uncoalesced). Using CUDA 
terminology, the device memory accesses result in multiple transactions if the data 
accessed by a warp of threads is scattered rather than nicely coalesced (contiguous) 
in memory. For more details, the interested reader is referred to [14]. 

One common strategy for avoiding race conditions in parallel computing is the 
synchronization of the execution at various points of the code. In CUDA, synchro- 
nization is possible but with a caveat. Specifically, threads that execute the kernel 
function yet belong to different blocks cannot be synchronized. This is a conse- 
quence of the earlier observation that there is no time slicing involved in block 
execution. When there are thousands of blocks that are lined up for execution wait- 
ing for their turn on one of the 30 SM of a Tesla CI 060, it is clear that there can be no 
synchronization between a thread that belongs to the first block and one that belongs 
to the last block that might get executed much later and on a different SM. Overall 
synchronization can be obtained by breaking the algorithm in two kernel functions 
right at the point where synchronization is desired. Thus, after the execution of the 
first kernel the control is rendered back to the host, which upon the invocation of 
the subsequent kernel ensures that all threads start on equal footing. This approach 
is feasible since the device memory is persistent between subsequent kernel calls 
as long as they are made by the same host process. The strategy works albeit at a 
small computational cost as there is an overhead associated with each kernel call. 
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Specifically, the overhead of launching a kernel for execution is on average between 
90 |xs (when no function arguments are present) and 120 |xs (when arguments such 
as pointers to device memory are passed in the kernel argument list). 

Looking ahead, the next generation of GPU hardware and CUDA software will 
make the heterogeneous computing model, where some tasks are executed by the 
host and other compute intensive parts of the code are delegated to the GPU, even 
more attractive. Slated to be released by March 2010, the Fermi family of GPUs will 
have 512 SPs in one SM and up to 1 TB of fast Graphics Double Data Rate, version 
5 (GDDR5) memory. Moreover, the current weak double precision performance of 
the GPU (about eight times slower than single precision peak performance) will 
be improved to clock at half the value of the single precision peak performance. 
Finally, on the software side, the CUDA run-time API will provide (a) support for 
stream computing where expensive host-device data moving operations can be over- 
lapped with kernel execution, and (b) a mechanism to simultaneously execute on 
the device different kernels that are data independent. It becomes apparent that if 
used for the right type of applications, that is, when the execution bottleneck fits 
the SIMD computational model, and if used right, GPU computing can lead to im- 
pressive reductions in computational time. Combined with its affordability attribute, 
GPU computing will allow scientific computing to tackle large problems that in the 
past fell outside the realm of tractable problems. The class of granular dynamics 
problems is one such example, where a discrete approach to equation formulation 
and solution was not feasible in most cases in the past. 



3 Large Scale Multibody Dynamics on the GPU 

This section briefly introduces the theoretical background for mechanical systems 
made up of multiple rigid bodies whose time evolution is controlled by external 
forces, frictional contacts, bilateral constraints and motors. 



3.1 The Formulation of the Equations of Motion 

The state of a mechanical system with «/, rigid bodies in three dimensional space 
can be represented by the generalized coordinates 



q=[rf,6f,...,r„^^,€jJ^eK'"* 



and their time derivatives 



q=[rf,6f,...,r„^^,cjj 



T 



e 
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where rj is the absolute position of the center of mass of the j -th body and the 
quaternion 6/ expresses its rotation. One can also introduce the generalized veloc- 
ities V = [rf , ctff , . . . , rj^ ,d>^ ]^ G M^"*, directly related to q by means of the 
linear mapping q = L(q)v that transforms each angular velocity cot (expressed in 
the local coordinates of the body) into the corresponding quaternion derivative e, by 
means of the linear algebra formula e, = jG{ej)a)i, with 



Giej) = 



+ci +eo -C3 +^2 
+C2 +€3 +co -ei 
.+63 -ei +ei +eo_ 



Mechanical constraints, such as revolute or prismatic joints, can exist between 
the parts: they translate into algebraic equations that constrain the relative position 
of pairs of bodies. Assuming a set B of constraints is present in the system, they 
lead to the scalar equations 

*, (q, t) =0, / G B. 

To ensure that constraints are not violated in terms of velocities, one must also 
satisfy the first derivative of the constraint equations, that is 

V*f V + — -^ = 0, i € B. 
at 

with the Jacobian matrix V^*, = [d'iii/dq]^ and V*f = V^*f L(q). Note that 
the term 9>I',/9? is null for all scleronomic constraints, but it might be nonzero for 
constraints that impose some trajectory or motion law, such as in the case of motors 
and actuators. 

If contacts between rigid bodies must be taken into consideration, colliding 
shapes must be defined for each body. A collision detection algorithm must be used 
to provide a set of pairs of contact points for bodies whose shapes are near enough, 
so that a set A of inequalities can be used to concisely express the non-penetration 
condition between the volumes of the shapes: 

<l>,(q)>0, i G A 

Note that for curved convex shapes, such as spheres and ellipsoids, there is a 
unique pair of contact points, that is the pair of closest points on their surfaces, but 
in case of faceted or non-convex shapes there might be multiple pairs of contact 
points, whose definition is not always trivial and whose set may be discontinuous. 

Given two bodies in contact A, B G {1,2,..., n/,} let n, be the normal at the 
contact pointing toward the exterior of body A, and let u, and w, be two vectors 
in the contact plane such that n,- , u, , w/ G R-' are mutually orthogonal vectors. 
When a contact / is active, that is, for $/ (q) = 0, the frictional contact force acts on 
the system by means of multipliers y,-^„ > 0, y/,a, and y,-^„.. Specifically, the normal 
component of the contact force acting on body B is F,^7v = Y/,«n, and the tangential 
component is F,-^7- = Y,-,«u, -I- Y,-,,vW/ (for body A these forces have the opposite 
sign). 
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Also, according to the Coulomb friction model, in case of nonzero relative 
tangential speed, yt^r, the direction of the tangential contact force is aligned to v/^r 
and it is proportional to the normal force as \\Fi^T \\ = fJ-i,d l|F;,jv || by means of the 
dynamic friction coefficient /x, j e M"*" . However, in case of null tangential speed, 
the strength of the tangential force is limited by the inequality ||F/,7-|| = /x/^i||F/^Ar|| 
using a static friction coefficient /x,,^ e M"*", and its direction is one of the infinite 
tangents to the surface. In our model we assume that /x,^j and /x, ., have the same 
value that we will write /x, for simplicity, so the abovementioned Coulomb model 
can be stated succinctly as follows: 

Yi,n > 0, $,(q) > 0, $,(q)y,,„ = 0, 



,n > y/9t 



f^iYi,n > JYu, + yL 



(F/,r.v/,r) = -||F/,r|| llv^rll 
l|v/,r|| UiYi^n - ^Jy^, + ylJ = 

Note that the condition y, „ > 0, $, (q) > 0, <I>, (q)Y,,« = can also be written as 
a complementarity constraint: ■/;,« > _L <I>, (q) > 0, see [15]. This model can also 
be interpreted as the Karush-Kuhn-Tucker first order conditions of the following 
equivalent maximum dissipation principle [6, 16]: 

(yi,u, yi,iv) = argmin yf^iyt^uUi + y,,„w,). (1) 

Finally, one should also consider the effect of external forces with the vector of 
generalized forces f(f , q, v) e M^"* , that might contain gyroscopic terms, gravita- 
tional effects, forces exerted by springs or dampers, and torques applied by motors; 
i.e. all forces except joint reaction and frictional contact forces. 

Considering the effects of both the set A of frictional contacts and the set B of 
bilateral constraints, the system cannot be reduced to either a set ordinary differen- 
tial equations (ODEs) of the type v = /(q, v, t), or to a set of differential-algebraic 
equation (DAEs). This is because the inequalities and the complementarity con- 
straints turn the system into a differential inclusion of the type v e Tiq, \J), where 
!F{-) is a set-valued multifunction [17]. In fact, the time evolution of the dynamical 
system is governed by the following differential variational inequality (DVI): 

q = L(q)v 

Mv = f(f.q,v) + ^y;,z,V*,- 
ieB 

+ XI (yi,n^i,n + yi,u D;,„ + Yi,w Di,w) 
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i €B : *,(q,0 = 
i € A : Yi,„ > 0_L $,(q) > 0, and 
(y/,«, Yi,w) = arg min 



v^(y,-,„D,-,„ + y,-,„. D/,,v) 



(2) 



Here, to express the contact forces in generalized coordinates, we used the tan- 
gent space generators Z),- = [D/,„,D,-^„,D,-^it] e ^^"h^^ ji^at are sparse and are 
defined given a pair of contacting bodies A and B as: 



dT = L 



<p AlpAASuA 







/4^ 



-AT„Abs, 



i,B ...] 



(3) 



Here Aj^p = [n,, u/,w,] is the R^'*-' matrix of the local coordinates of the i-th 
contact, and the vectors s,,^ and s,,b to represent the positions of the contact points 
expressed in body coordinates. The skew matrices s,,^ and s,,^ are defined as 



Si, A 



-Si, A, +Si,Ay 

+Si,A, -Si, Ax 

-Si,Ay +^/,/l.v 



,Si,B 



-Si,B, +Si,By 

+Si,B, -Si,B:, 

-Si,By +^i,B.v . 



The DVI in (2) can be solved by time-stepping methods. The discretization 
requires the solution of a complementarity problem at each time step, and it has 
been demonstrated that it converges to the solution to the original differential inclu- 
sion for h —^ [15, 18]. Moreover, the differential inclusion can be solved in terms 
of vector measures: forces can be impulsive and velocities can have discontinuities, 
thus supporting also the case of impacts and giving a weak solution to otherwise 
unsolvable situations like in the Painleve paradox [19]. 



3.2 The Time Stepping Solver 



Within the aforementioned measure differential inclusion approach, the unknowns 
are not the reaction forces and the accelerations v as in usual ODEs or DAEs. 
Instead, given a position q^'^ and velocity v^'^ at the time step f '^'\ the unknowns are 
the impulses ys, for s = n, u, w, b (that, for smooth constraints, can be interpreted 
as y„ = hy„,yu = hy,i,y„ = hy„,yh = hyt,) and the speeds v^'"*"^^ at the new 
time step r^'+i' = ?(') -|- h. These unknowns are obtained by solving the following 
optimization problem with equilibrium constraints [2]: 



M(v('+i) - v(')) = M(?('),q('',v(')) + ^y,-,6V*,- 

ieB 
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h at 

I &A: < -<I>;(q^'^) + Df„v^'+i^± y'„ > 0, 
h ' 

(Yi,u, Yi,») = arg min \^(Yi,u D/,« + yi,w D;,h) 

M/y/.«>-/y,?„+y,?„, 

q('+i)= q(')+/iL(q('))v('+i). (4) 

The ^$, (q ) term is introduced to ensure contact stabilization, and its effect 
is discussed in [3]. Similarly, the term ^*, (q*^'^) achieves stabilization for bilateral 
constraints. 

Several numerical methods can be used to solve (4). For instance, one can ap- 
proximate the Coulomb friction cones in 3D as faceted pyramids, thus leading to a 
LCP whose solution is possible by using off-the-shelf pivoting methods. However, 
these methods usually require a large computational overhead and can be used only 
for a limited number of variables. 

Therefore, in a previous work [20] we demonstrated that the problem can be 
cast as a monotone optimization problem by introducing a relaxation over the 
complementarity constraints, replacing < ^<I>, (q^'') + J)f„y'-^~^^^ _L y^ > with 

0<ji^i (q^^') + D,>v('+i^ - M,- V(v^D/,»)^ + (v^D,-,vv)^ ± Yn > »■ The solution 
of the modified time stepping scheme approaches the solution of the original dif- 
ferential inclusion for h — >■ just as the original scheme [3]. Most importantly, the 
modified scheme becomes a Cone Complementarity Problem (CCP), which can be 
solved efficiently by an iterative numerical method that relies on projected contrac- 
tive maps. Omitting for brevity some of the details discussed in [21], the algorithm 
makes use of the following vectors and matrices: 

yi,a = {y(,«, yi,u, y/,H}^, ' e A. 

Z,,-^i*,(q^'\0 + ?^. i^A (5) 

h dt 

The solution of the CCP is obtained by iterating the following expressions on r 
until convergence, or until r exceeds a maximum amount of iterations, starting from 

Wi€A: y[+' = nx, [y^a - mt {dJV + b,)] (6) 

V,- e ^ : y\+' = yii, - «/]/ (v^fV + b,- ) (7) 
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v'-+i = v'- + M-1 j ^ £>,y:+i + J2 ^*zK'r + ^ f(^^'^ ' q^'^ ' ^^'^n (8) 

VzS^ zei3 ' / 

Note that the superscript (/ + 1) was omitted for brevity. 

The iterative process uses the projector 11 x, (0^ which is a non-expansive metric 
map rix, : M^ — > R^ acting on the triplet of multipliers associated with the i-th 
contact [20]. In detail, if the multipliers fall into the friction cone 

T,={y,-,. GM3:(y,?„ + y,?,y/'<M,T,>} 

they are not modified; if they are in the polar cone 

T° = {x,- e R3 : (x,-, y,-,«) < 0, Vy/,« e T,} 

they are set to zero; in the remaining cases they are projected orthogonally onto 
the surface of the friction cone. The over-relaxation factor cd and r], parameters are 
adjusted to control the convergence. Interested readers are referred to [21] for a 
proof of the convergence of this method. 

For improved performance, the summation of Eq. (8) can be computed only once 
at the beginning of the CCP iteration, while the following updates can be performed 
using an incremental version that avoids adding the t{t^^\q^^\y^'^) term all the 
time; in case there is no initial guess for the multipliers and y°^ = 0, yP^ = 0, 
Eq. (8) turns into: 

vO=v(') + M-i/;f (?('), q('),v(')) (9) 

v'-+i =v'--f^Av,- (10) 



where 






In the case that only bilateral constraints are used, this method behaves like the 
typical fixed-point Jacobi iteration for the solution of linear problems. If one in- 
terleaves the update (8) after each time that a single i-th multiplier is computed in 
(6) or (7), the resulting scheme behaves like a Gauss-Seidel method. This variant 
can benefit from the use of Eq. (10) instead of Eq. (8) because it can increment only 
the Av,: term corresponding to the constraint that has been just computed. Also, 
this immediate update of the speed vector provides better properties of convergence 
(especially in case of redundant constraints) but it does not fit well in a parallel 
computing environment because of its inherently sequential nature. 
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3.3 The GPU Formulation of the CCP Solver 

Since the CCP iteration is a computational bottleneck of the numerical solution 
proposed, a great benefit will follow from an implementation that can take advantage 
of the parallel computing resources available on GPU boards. 

In the proposed approach, the data structures on the GPU are implemented 
as large arrays (buffers) to match the execution model associated with NVIDIA's 
CUDA. Specifically, threads are grouped in rectangular thread blocks, and thread 
blocks are arranged in rectangular grids. Four main buffers are used: the contacts 
buffer, the constraints buffer, the reduction buffer, and the bodies buffer. Since re- 
peated transfers of large data structures can adversely impact the performance of the 
entire algorithm, an attempt was made to organize the data structures in a way that 
minimized the number of fetch and store operations and maximized the arithmetic 
intensity of the kernel code. This ensures that the latency of the global memory can 
be hidden by the hardware multithread scheduler if the GPU code interleaves the 
memory access with enough arithmetic instructions. 

Figure 5 shows the data structure for contacts, which contains two pointers Ba 
and Bb to the two touching bodies. There is no need to store the entire /), matrix 
for the i-th contact because it has zero entries everywhere except for the two 12x3 
blocks corresponding to the coordinates of the two bodies in contact. In detail, we 
store only the following 3x3 matrices: 
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Fig. 5 Data structures in GPU global memory 
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Once the velocities of the two bodies r^ . , aiyj . , r^. and mbj have been fetched, 
the product DJV in Eq. (6) can be performed as 



D 



f V- = dIJ:a, + Dl^^coA, + Dl^TB, + D^^^cbB, (U) 

Since DJ^^ = —DJ^ , there is no need to store both matrices, so in each contact 
data structure only a matrix Df^, is stored, which is then used with opposite signs 
for each of the two bodies. 

Also, the velocity update vector Av,, needed for the sum in Eq. (10) is sparse: 
it can be decomposed in small 3x1 vectors. Specifically, given the masses and 
the inertia tensors of the two bodies m^. , niBi . Jaj and Jb, , the term Av, will be 
computed and stored in four parts as follows: 

Ars, =m^;A-,v^Ay,';+i, Ad>B^ = J^^ Di,^^AY[+' (12) 

Note that those four parts of the Av, terms are not stored in the i -th contact or 
data structures of the two referenced bodies (because multiple contacts may refer the 
same body, hence they would overwrite the same memory position). These velocity 
updates are instead stored in the reduction buffer, which will be used to efficiently 
perform the summation in Eq. (10). This will be discussed shortly. 

The constraints buffer, shown in Fig. 5, is based on a similar concept. Jacobians 
V*!*, of all scalar constraints are stored in a sparse format, each corresponding 
to four rows V*,-,„^, V*/,^,^, V*,,,,^, V*,,„^. Therefore the product V*f v*" in 
Eq. (7) can be performed as the scalar value: 

V*f v-- = V<„/^, + V<„^«^, + V^.^r^, + ^^l^^coB, (13) 

Also, the four parts of the sparse vector Av, can be computed and stored as 

Ar^, = m-^]V^i,,.^Ayl+\ A«^, = /j/V*,- „,, Ay,';+i 

Ars, = m^iV*,- ,^Ay,';+i, Ad>B, = /^/V*,- «^ Ay,';+i (14) 

Figure 5 shows that each body is represented by a data structure containing the 
state (velocity and position), the mass moments of inertia and mass values, and the 
external applied force F/ and torque Cj . Those data are needed to compute the CCP 
iteration and solve for unknowns. 

When it comes to the implementation of the CCP solver on the GPU, using ker- 
nels that operate on the abovementioned data buffers, the task is not trivial because 
the iteration cannot be performed with a single kernel. In fact, considering the iter- 
ation over Eqs. (6), (7), and (10), one can see that Eqs. (6) and (7) fit into parallel 
kernels that operate, respectively, one thread per contact and one thread per bilat- 
eral constraint. Moreover, the summation in Eq. (10) cannot be easily parallelized 
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Fig. 6 Example of reduction buffer for summing up body velocities 



in the same way because it may happen that two or more contacts need to add 
their velocity updates Av, to the same rigid body: this would cause a race condition 
where multiple threads might need to update the same memory value, something 
that can cause errors or indefinite/nondeterministic behaviors on the GPU hard- 
ware. Therefore, in order to parallelize Eq. (10), a parallel segmented scan algorithm 
[22] was adopted that operates on an intermediate reduction buffer (see Fig. 6); this 
method sums the values in the buffer using a binary-tree approach that keeps the 
computational load well balanced among the many processors. In the example of 
Fig. 6, the first constraint refers to bodies and 1 , the second to bodies and 2; mul- 
tiple updates to body are then accumulated with parallel a segmented reduction. 

Note that several other auxiliary kernels that have minimal impact on the compu- 
tation time are used to prepare pre-process data before the CCP starts, for example 
to compute Eq. (9). Also, to speed up the computation, matrices Df^, , Df^ and 
Df^ are not provided by the host; instead they are computed on the GPU using 
the data coming from the collision detection code, that is, s,,^. s,,b and n, . 

The following pseudocode shows the sequence of the main computational stages 
at each time step, which for the most part are executed as parallel kernels on the 
GPU (Table 2). 

Stages 1 and 10 can be avoided if one manages to keep all the data on the GPU, 
by letting the collision detection engine communicate with the CCP solver directly. 
Even if those memory transfers are executed only at the beginning and at the end 
of the CCP solution process, their impact on the overall simulation time might be 
significant. 
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Table 2 Pseudocode for tire CCP solver 



Stage 



Context 



Operations/kernels 



HOST 
Serial 



GPU 
Parallel on 

bodies 
GPU 
Parallel on 

contacts 

GPU 

Parallel on 
bodies 

GPU 

Parallel on 
contacts 



GPU 

Parallel on 
constraints 



7 


GPU 




Parallel on 




reduction 




slots 


8 


GPU 




Parallel on 




bodies 


9 


HOST 




Serial 


10 


HOST 




Serial 



Copy memory CPU^ GPU 

Copy contact and body data structures from host memory to 

GPU buffers 
Copy constraint data (residuals i), and Jacobians) into the 

constraint buffer 
Force kernel 
For each body, compute forces f(/"*, q"*, v"*), if any. Store 

these forces and torques into Fy and Cj 
Contact preprocessing kernel 
For each contact, given contact normal and position, compute in 

place the matrices DJ^.^ , Dj^^^ and Dj^^^ , then compute f), 

and the contact residual b, = {^<I>, (q),0, O} 
CCP force kernel 
For each body j , initialize body velocities: 

rJ'+'>=^;„-'F, and4+'»=/7/-'C, 
CCP contact iteration kernel 
For each contact i, do 



Y; 



r + l 



= n-r\y':^^-(^r),(DjY+h,)] 



Note that D,- v"" is evaluated with sparse data, using Eq. (11) 

Store Ay[J~ = y\~^ — y\ ^ in contact buffer. Use Eq. (12) to 
compute sparse updates Ar and A&i to the velocities of the 
two connected bodies A and B, and store them in the KiA 
and KjB slots of the reduction buffer 

CCP constraint iteration kernel 

For each constraint i, do 



y;:^ 



Y/,A ■ 



Note that V>I'f v'' is evaluated with sparse data, using Eq. (11) 
Store Ay- ^ = Y,' 6 ~ Y; i in contact buffer. Use Eq.(14) to 

compute sparse updates Ar and A&i to the velocities of the 

two connected bodies A and B, and store them in the Rja 

and KjB slots of the reduction buffer 
Segmented reduction kernel 
Sum all the Ar and Ai5 ternis belonging to the same body, in the 

reduction buffer. This may require a sequence of short 

kernels 
Body velocity updates kernel 
For each j body, add the cumulative velocity updates which can 

be fetched from the reduction buffer, using the index Rj 
Check convergence and repeat from stage 5 if convergence 

tolerance is not reached 
Copy memory GPU^ CPU 
Copy contact multipliers from GPU buffers to host memory, 

if interested in reaction forces 
Copy constraints multipliers from GPU buffers to host memory, 

if interested in reaction forces 
Copy rigid body velocities from GPU buffers to host memory 
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4 Numerical Experiments 

The largest simulation run to date using the CCP-based GPU solver discussed herein 
contained approximately 1 . 1 million bodies that interacted through frictional contact 
as illustrated in Fig. 7. This problem has a large number of small spheres made up 
of a material with high density. There is one large ball of low density mixed up 
with the rest of the spheres. The collection of balls is inside a three dimensional 
rectangular box that experiences a left-to-right harmonic motion. Because the large 
ball has lower density, it will eventually "float" on the spheres of high density as 
illustrated in the figure. This test, along with other simulations focused on tracked 
vehicle mobility on granular terrain are discussed in detail in [23,24]. An animation 
is available at [25]. 

In what follows the emphasis is on a comparison between the GPU-based solu- 
tion and a sequential approach used to solve a benchmark problem; i.e., the flow of 
a pebble bed nuclear reactor (Fig. 8). The fuel is encased in tennis-ball-size graphite 
spheres, each filled with nuclear fuel, specifically, coated UO2, with sub-millimeter 
diameter [26]. The approximately 400,000 pebbles are continuously recirculated or 
refreshed at a rate of about 2/min [27]. They are densely packed, at volume fractions 
approaching 0.6, and thus constitute a dense granular flow [28]. The center pebbles, 
represented with a different color, are moderator pebbles with comparable weight to 
the fuel pebbles, even if they do not contain particles of coated UO2. The reactor is 
cooled with a fast helium flow blown top-down that has negligible drag effects on 
the spheres when compared to gravitational forces [28]. Predicting the dynamics of 
the fuel pebbles in the pebble-bed reactor is important for its safety and gauging its 
performance [29]. 

To better understand the potential of parallel computing when employed to 
solve the problem at hand, both the sequential and parallel implementations draw on 
the same solution procedure detailed in Section 3. The only difference is that in one 
case the collision detection and the solution of the cone-complementarity problem 
are carried out sequentially, on the CPU, while for the parallel implementation these 



Fig. 7 Largest problem 
simulated to date, the system 
has about 1 . 1 million bodies 
that are shaken in a moving 
box 
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Fig. 8 Pebble bed nuclear 
reactor simulation 




two Stages, along with several other less computationally intensive steps of the so- 
lution methodology, are executed on the GPU. The benchmark problem was run for 
a set of 16, 32, 64, and 128 thousand particles. The sequential simulation was run 
on a single threaded Quad Core Intel Xeon E5430 2.66 GHz computer. For the par- 
allel version, the collision detection was implemented on a NVIDIA 8800 GT card, 
while the cone complementarity problem was solved on a Tesla C870. The integra- 
tion time step considered for this problem was 0.01 s. A number of 150 iterations 
was considered in the solution of the CCP problem. 

The dynamics of the pebble flow is as follows. First, the silo is closed and the 
balls are dropped from the top until the desired number of spheres is reached. The 
silo is subsequently opened, at which time the pebble flow commences. Shortly 
thereafter the flow reaches a steady state. At this time, the amount of time it 
takes to advance the simulation by one time step is measured. An average of this 
value obtained over several simulations is reported in Fig. 9. This process is car- 
ried out for both the GPU and CPU implementations for each of the four scenarios 
(16,000-128,000 bodies). The plot reveals that (a) both the CPU and GPU imple- 
mentations scale linearly with the number of bodies in the problem, and (b) the slope 
of the plot associated with the GPU implementation is smaller than that associated 
with the CPU solver. In fact, for 128,000 particles, the GPU solver is about 10 times 
faster than the CPU solver. As the interest is in multi-million body problems, this 
slope difference will result in significant reduction in simulation times. 

The plot in Fig. 10 provides the history for the amount of time it took the GPU 
solver to perform one integration step. In the beginning, when the balls are filling 
up the silo, there are few contacts and one integration time step is cheap. As the 
number of spheres in contact increases due to piling up of the bodies at the bottom 
of the silo, the time it takes to complete one time step increases. This is due to the 
gradual increase in the dimension of the CCP problem that needs to be solved. An 
artifact of the fact that only 150 iterations were considered in the CCP problem is 
the spurious increase (the peak) that is more pronounced for the 128,000 body case. 
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Fig. 9 Average duration in 
seconds for taking one 
integration time step (0.01 s) 
when taking 150 iterations in 
the CCP solver 



AnngelbtiJIha* 






t GPUAyoigpTblill^H 
■ CPtJivcngcTDIilTtiK 





Fig. 10 For the GPU solver, the plot shows how much time it took the solver to advance the sim- 
ulation by one time step. On the horizontal axis is shown the simulation time. After approximately 
3 seconds, when the flow reaches a steady state, each time step takes about the same amount of 
time, that is, approximately 9 s. These times were used in Fig. 9 to generate the lower curve (one 
with smaller slope) 



This was intentionally kept in order to demonstrate what happens if the CCP prob- 
lem is not solved accurately. Specifically, if the number of iterations is not enough to 
lead to the convergence of the CCP solver, the amount of penetration between bod- 
ies will increase leading to a larger number of contacts and therefore a larger CCP 
problem. However, as the bottom of the silo opens up, the bodies start falling and 
this regime is less challenging for the solver since the number of contacts suddenly 
decreases until reaching a steady state shortly after 3 s. At that point the amount of 
time required to advance the simulation by one time step stabilizes. Note that the 
average of this value over several simulations was used in Fig. 9 to generate the plot 
associated with the GPU-based solution. 
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5 Conclusions and Directions of Future Work 

Two observations justify today a fresh look at the solution of frictional contact prob- 
lems in large scale multibody dynamics simulation. First, existing graphics cards 
provide tremendous flop rate at very low costs. Second, there is a wide spectrum of 
real-life applications that lead to large frictional contact dominated multibody dy- 
namics problems that couldn't be solved in the past. The contribution of this paper is 
in presenting an approach for parallel computational multibody dynamics that can 
be used to tackle many of these applications. The frictional contact problem was 
formulated in a form that is suitable to be numerically solved in parallel and could 
take advantage of commodity parallel computing support available on the GPU. The 
collision detection and solution of the cone complementarity problem turned out to 
be the main computational bottlenecks of the simulation. Both these stages of the 
solution have been parallelized thus enabling the implementation of an approach 
that can tackle problems with more than one million bodies. 

Ongoing projects are aimed at: (a) increasing the size of the problem that can 
be solved by the proposed approach, (b) improving the speed of convergence of the 
CCP solver, (c) establishing the hardware infrastructure that can support the simu- 
lation of multibody dynamics problems with tens to hundreds of millions of bodies, 
and (d) performing an experimental validation of the simulation approach proposed. 
In terms of (a), current numerical experiments revealed that the 4 GB memory on 
the Tesla C1060 cards is exhausted for simulations that exceed 1.1 million bodies. 
A domain decomposition approach is anticipated to further increase this number 
by distributing a large problem to multiple GPUs using a spatial domain decom- 
position idea. The net outcome of this approach will be a pooling together of the 
memory resources of multiple cards. In terms of (b), it is anticipated that algebraic 
multi-grid methods will enable a reduction of the number of iterations required for 
convergence. Unless this issue gets addressed, problems with tens of millions of 
bodies might require prohibitively long convergence times that render the approach 
infeasible. In terms of (c), a 21 SP Teraflop cluster is currently assembled at the 
University of Wisconsin to support the domain decomposition approach described. 
The cluster will have one head node and six compute nodes, each of the latter with 
four Tesla C1060 NVIDIA GPUs. Finally, experimental validation is currently car- 
ried out both at macroscale, using 5 mm plastic particles, and microscale, using 100 
and 500 |xm glass spheres, respectively. In both cases, the experiments measure flow 
rates in silo replicas and a small hopper to validate the correctness of the simulation 
results. In addition to these four initiatives, there is a multitude of small projects that 
remain to be completed, the most important of which being the integration of the 
collision detection and CCP data structures. Currently, data is moved back and forth 
between the device and host right after the collision detection and before performing 
the CCP. This adds a significant overhead that once eliminated is anticipated to fur- 
ther improve the performance of the GPU solver. 
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Investigation of Gears Using an Elastic 
Multibody Model with Contact 



Pascal Ziegler and Peter Eberhard 



Abstract The classical approach to simulate contacts between gears is to use rigid 
body models coupled with a parallel spring damper combination. However, these 
models had been developed for properly meshing gears with smooth contacts and 
cannot cover wave propagation caused by hard contacts or impacts. Moreover, as 
they are based on the assumption of rigidness, often light weight designs, resulting 
in very compilable gear bodies, cannot be considered appropriately. To evaluate how 
appropriate these rigid body models are to simulate impact forces, a very detailed fi- 
nite element model is used to simulate several impacts and the results are compared 
to simulations with a rigid body model. The results reveal that for compilable gear 
bodies, there exist dynamic effects that considerably affect contact forces and mo- 
tion and that these effects cannot be covered by rigid body models at all. Hence, a 
flexible model is imperative to precisely simulate impact forces. To reduce integra- 
tion time, we present a modally reduced elastic multibody model including contact 
that allows very precise simulations in reasonable time. For the contact calculations 
a node-to-segment penalty formulation is introduced and is integrated using central 
differences. Even though the elastic model is a reduced model, it is still of huge 
size, as any node on any flank is a potential contact node. Also, the transformation 
data between modal and nodal coordinates must be accessible during integration. 
To reduce the required amount of memory a coarse collision detection is introduced 
that allows to dynamically reload only the transformation required in the current 
integration step. This approach allows very precise simulations of contacts between 
gears with integration times about 400 times faster than for associated finite ele- 
ment simulations. At the same time the model is robust and fast enough to allow the 
simulation of many contacts and many revolutions. To validate this approach basic 
experimental investigations with simple impact bodies have been carried out. The 
results from these experiments and related simulations agree very well. 
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1 Introduction 
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The necessity to simulate systems with gears is mainly driven by two objectives. 
First, to simulate the dynamical interaction of gears with other system components 
or to determine joint loads and, second, to calculate stresses in gears. The first objec- 
tive is often approached by using rigid body models, like [5], since they are simple 
to set up and fast due to their small number of degrees of freedom. 

The investigation of stresses requires an elastic model and very often finite el- 
ement models are used in this context. To obtain meaningful stresses one has to 
specify realistic load cases and very often, these load cases are calculated using the 
aforementioned rigid body models in global overall dynamic systems. 

However, if the gears are very compilable the separation of both calculations may 
not be valid anymore since the elasticity of the gear may have significant influence 
on the motion and thus on the contact forces, too. Hence, for compilable gears an 
elastic model is necessary right from the start to get physically correct and usable 
results. 



2 Classical Models 

To quantify the influence of the elasticity of gears on contact forces, a gear pair from 
a commercially used real gear train is investigated for several contacts. The gear pair 
consists of a crank shaft gear and an idler gear, see Fig. 1. This particular gear pair 
is mainly chosen due to the very compilable idler gear. The impact-like contacts 
typical for gear trains are likely to excite higher eigenfrequencies than those that 
are usually considered. Each gear is supported by a perfect rotational joint and both 
gears are assembled to initially touch in their pitch points, while the idler gear is 
given an initial rotational velocity of ft;,o. 




Fig. 1 Finite element models of crank shaft gear and idler gear 
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2.1 Finite Element Model 

Figure 1 shows the finite element meshes of the crankshaft gear (left) and the idler 
gear {right). Both gears are meshed using linear hexahedral elements only. To re- 
present the flank profiles precisely, the profiles are modeled by copying the actual 
production process during pre-processing, see [8]. In the contact areas an element 
edge length of about 0.2 mm is used to guarantee a good resolution of contact 
stresses and contact forces. In the gear body, however, the edge length is about 
8 mm to be able to account for wave phenomena up to a frequency of 80 kHz, see 
[1]. This results in a total number of 145,000 elements. 

The rotational joints are modeled by kinematically constraining all bore nodes 
to a reference node in the gear center that has only one rotational degree of free- 
dom. For the material a fully elastic linear material model with Young's modulus 
E = 210,000 Nmm^, density p = 7,850 kg/m^ and Poisson ratio ji = 0.3 is used. 
Damping is neglected. The finite element model is set up in Abaqus and integrated 
using the explicit central-difference based algorithm Abaqus/Explicit. The contact 
is modeled using the kinematic contact algorithm provided by Abaqus/Explicit. 



2.2 Rigid Body Model 

The simple multibody model for the gear pair consists of two rigid bodies, each 
having one rotational degree of freedom. The inertia properties used here are derived 
from the undeformed configuration of the finite element model and can, therefore, be 
considered to be very precise. Both bodies are coupled with a typical spring-damper 
combination as included in many commercial programs such that the associated 
forces act on the base radii rj,i and rj,2, respectively, see Fig. 2. 

The backlash element in series to the spring element accounts for the unilateral 
contact. Damping is fully neglected, hence d = and the equations of motion 
follow as 

li^Pi + rtiC irb\(p\ + rb2(P2 - e(t)) = , / = 1, 2 (1) 




Fig. 2 Rigid body model of a gear pair 
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where the backlash function e(t) reads 
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e(t) = 



for (rti(pi + rtifi) > 0, 
for (rhi(Pi + rhjfi) < 0. 



(2) 



A good choice for the stiffness c is very important for good results, since this 

parameter describes all elasticities of the gear. According to the industrial standard 

DIN 3990-1, here, the so called single stiffness c', see [6], is used. Even though 

there exist industrial standards like [3] or [4] to calculate the single stiffness, the 

values can vary significantly, see [14]. Therefore, the single stiffness is determined 

from a static finite element analysis using the finite element model described above 

and follows as 

N 

c =c' = 5.68 , (3) 

/xmmm 

see [14] for details. The rigid body model is set up using a commercial multibody 
software and the provided gear contact force element is used and the parameters for 
the force routine are chosen such that they represent the model above. The model is 
integrated using the implicit Radau 5 scheme. 



2.3 Comparison 

Figure 3 shows the simulated contact forces for an initial rotational velocity of 
(W/o = 100°/s. Apparently, the contact forces calculated with the rigid body model 
are much higher than those calculated with the finite element model. In addition, the 
contact forces from the finite element model shows strong vibrations during contact. 
Considering the finite element model to be very precise, hence a reference solution, 
the rigid body model simply gives completely wrong results. 
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Fig. 3 Contact forces for the rigid body model and the finite element model 
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Fig. 4 Rotational velocities for the rigid body model and the finite element model 
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Fig. 5 Several elastic effects: Polygonalization and torsional oscillation (left) and multiple pairs 
of cogs in contact 



The rotational velocities, shown in Fig. 4, show considerable differences, too. 
Particularly at times between impacts, the rotational velocities from the finite ele- 
ment simulations show a very noticeable vibration that can not be observed from 
the rigid body model. 

The differences in contact forces and rotational velocity are direct results of the 
elasticity of the gear body that cannot be sufficiently considered in a rigid body 
model even with refined force laws. During impact, the outer gear rim deforms 
under the contact force while some of the energy is stored in the deformation. This 
deformation can be seen from Fig. 5 (left). 

One result of this deformation is that the contact force decreases shortly to in- 
crease again when the deformation energy is restored. A second result is that due to 
deformation the pitch changes to such an extend that for the investigated contact 
situation three pairs of cogs come into contact, while for a rigid body, theoret- 
ically only one pair is possible, see Fig. 5 (right). That explains that the contact 
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forces of the rigid body model are consistently too high. Another consequence of 
the deformation, shown in Fig. 5 (left) is the rotational vibration between contacts. 
The mass of the idler gear is mainly concentrated in the hub rim and the outer rim 
while the inner gear body is very compilable. Therefore, the impact excites an out 
of phase torsional vibration of the outer rim against the hub rim that can particularly 
be seen between contacts. 

The results clearly show that for compilable gears an elastic model is absolutely 
necessary for precise simulations of contact forces, but regarding the integration 
time of about 8 days on a Pentium 4 for the results of the finite element model pre- 
sented above also shows that the simulation of several revolutions is not practically 
feasible using finite elements. 



3 Elastic Multibody Model 

To overcome the large integration times of the finite element model while still hav- 
ing a numerically accurate model, the floating frame of reference approach, see 
[11], is chosen. Due to the formulation of the deformation with respect to a refer- 
ence frame that is allowed to undergo large nonlinear motions, the large rotation of 
the gears can easily be described while still the deformation can be assumed to be 
linear, thus allowing an efficient modal order reduction. Here, a modally reduced 
description is used. The equations of motion read 

M(z/)z// -htt,(z/,z//)-he(z/,z//)-hc(z/,z//) =0, (4) 

where M is the mass matrix, h^, are generalized inertial forces, h^ are generalized 
internal forces and z/ and z// are the generalized position and velocity state vari- 
ables, respectively. All external forces and torques, including contact forces, are 
given by he. The generalized position and velocity variables are chosen as 

zi = [pa qf (5) 



z// = ['*' w q] with z/ = 



[p 


aq] 
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E 


? 





OJ^i 











E 



z// = J 1 (z/)z//, (6) 



where p is the absolute position of the reference frame with respect to the reference 
frame, a are appropriate rotation parameters to describe the orientation of the refer- 
ence frame and q are the elastic degrees of freedom. The absolute velocity and the 
rotational velocity are denoted by v and co, respectively. The relationship between 
the position and velocity states is given by the generalized Jacobian matrix J that 
mainly consists of the rotational Jacobian matrix J^^ that depends on the chosen 
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rotation parameters a and the skew symmetric matrix 'p that introduces the cross 
product in matrix notation and is derived from the translational positions p, see [9]. 
Due to the modal formulation, the elastic states q are modal coordinates. 



3.1 Contact Algorithm 

For the contact force calculation a general node-to-surface approach is used. How- 
ever, if possible, gears are designed with a prime number of cogs, hence the number 
ofpossible contact pairs for two gear with zi andz2 number of cogs is «„,nv = 2z\Z2- 
Obviously, a collision detection looping over all flank nodes and elements would be 
numerically extremely costly, as due to the modal description, a transformation from 
modal coordinates to nodal coordinates would be necessary for every flank node. 
Under normal operations typically at most two to three flanks are in contact at the 
same time. Therefore, a coarse collision detection reducing the node-based contact 
force calculation to these few flanks is unavoidable for performance reasons. 



3.1.1 Coarse Collision Detection 

The coarse collision detection is based on index nodes. Each cog is referenced by a 
separate index node that is approximately located in the center of each cog, see Fig. 6 
{left). The absolute positions of the index nodes are calculated in every integration 
step. Then, the center node, i.e. the index node closest to the center of the associated 
gear is determined and based on this center node a small number Zs of cogs on the 
left and right are considered as possible contact candidates, see Fig. 6 {right). For 
most designs Zs = 2 is sufficient. 

The benefit from this procedure is not only a reduction of possible contact nodes, 
but also a significant reduction of the size of the transformation matrix needed to 
transform from modal coordinates to absolute nodal positions. Here, a dynamic 
reloading scheme is used to dynamically load only the transformation data necessary 




Fig. 6 Coarse contact determination using index nodes (left) and contact candidates from these 
index nodes {right) 
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to transform the index nodes as well as the current nodes of the contact candidates. 
The transformation matrix is only updated if the center node changes. 

3.1.2 Fine Collision Detection 

The fine collision detection is based on a general node-to-surface contact, see Fig. 7. 
The goal is to find the contact point of the slave node Q on the surface spun by the 
nodes Pi of the surface of the master element. 

This is done by finding the point Xc with the smallest distance to the slave node 
Q which can be formulated by 



ax 
ax 



{^c,nc)-[i\-^{^c-r]c)] =0, 



i^c,Tlc)-[q-^{^c-r}c)] = 0. 



The distance g between Q and Xc is calculated as 



g = n-{x{^c,r]c)-q), with n 






(7) 
(8) 

(9) 



and denotes the gap or penetration of the slave node, dependent on the sign of g, see 
[12]. For the hexahedral elements used here, the shape function reads 



1^ 



(10) 



/ = ! 



accordingly, (7) and (8) are nonlinear and are solved using Newton-Raphson 
iteration. 




Fig. 7 Fine collision detection as iterative search for contact point 
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mid-plane cross section 



polygonal contact patch 



Fig. 8 Reduction of the three-dimensional contact problem to a two-dimensional polygonal con- 
tact problem for perfectly aligned spur gears 



When the penetration of the slave node is determined, the nodal contact forces 
can be calculated using a penalty approach. Then, the nodal contact force for the 
slave node follows directly from its penetration. The nodal forces acting on the four 
master nodes representing the master surface follow from the participation factors 
^c and rjc of the determined contact point Xc according to the shape function. When 
all nodal contact forces are calculated the overall nodal contact force vector can be 
assembled and transformed to modal coordinates. 

For perfectly aligned spur gears the three-dimensional contact situation can be 
simplified to a two-dimensional contact problem, see [13]. Then, only the nodes in 
the center plane of the contact candidates are considered and the contact problem is 
reduced to a polygonal contact problem, see Fig. 8 that can be solved using the ray- 
crossing algorithm, see [7]. However, in the implementation full three-dimensional 
contact is used. 

After determining which slave nodes have penetrated the polygon defined by the 
master nodes, the slave nodes are projected on the associated segments. Again, the 
nodal contact forces for the slave nodes are calculated using a penalty approach and 
the nodal master forces are calculated according to their participation factors. When 
all nodal forces are determined for the polygon the contact forces are appropriately 
distributed to all nodes on the contact line intersecting the respective contact nodes. 
Since the number of contact nodes is reduced tremendously, this two-dimensional 
fine collision detection is far more efficient than the three-dimensional cases, how- 
ever, it may not be as precise as the three-dimensional approach. 



3.2 Integration 



For the calculation of the contact forces many transformations to absolute positions 
are necessary. Therefore, in order to increase numerical efficiency, the equations of 
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motion (4) are formulated on acceleration level. With differentiation of equation (6) 
the equations of motion follow as 

J^MJz/ - J^ [Mjz> - ho, (z/ , z/) - he (z/ , z» - he (z/ , z/)] 
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0. (11) 



The second order differential equation (1 1) is integrated using the central differ- 
ence scheme, see [2] 



1 



z/,« = -r-^ (z/,«+i - 2z/,„ + z/,„_i) 



(12) 



and the new displacement z/ „+i follows as 

z/,„+i = At^M~^ {\\^ + he + he) + 2z/,„ -z/,„_i, with M = J^MJ. (13) 

The central difference method is a conditionally stable integration scheme and the 
integration step size has to be smaller than the critical time step Atcni = 2/o),„ax- 

In (13) the inverse of the state dependent mass matrix is needed which has to 
calculated in every integration step. For a modal model with properly scaled eigen- 
modes, the sub-matrix Mee in (11) is an identity matrix. This allows to use the 
structure of the mass matrix for an efficient calculation of the inverse 



M-i 



M sym. 

[Met Mer] M I + [Me, Mer] M [Met Mer]^ . 



(14) 



with 



M 



( 



'Mtt Mj; 

Mrr Mrr 



[Mer Mer]^ [Mer Mer] 



(15) 



Now the inverse of the mass matrix can be calculated by only inverting the 6x6 
matrix and a few matrix multiplications. This is not only much faster, but also much 
more precise than an iterative inversion of the full mass matrix. 

Constraints, like the ideal rotational joints in Sect. 2 can be incorporated by 
adding constraint equations to (11), eventually resulting in a differential-algebraic 
equation. However, this would destroy many of the measures taken to improve per- 
formance. As all constraints used with the models from Sect. 2 are scleronomic, 
they can easily be integrated into the equations of motion for comparison with these 
models. In more complicated cases, a penalty approach is used to model constraints. 
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In the following, simulation results for two gear pairs are given. The first gear pair 
is the pair of spur gears presented in Sect. 2. The simulation results from the elastic 
multibody model are compared to the finite element solution and the numerical effi- 
ciency is evaluated on the basis of these simulations. The second gear pair consists 
of two helical gears with cylindrical gear bodies. Besides contact forces and rota- 
tional velocities, joint forces are investigated and the results are compared to a spur 
gear pair of the same size. 



4.1 Spur Gears 

Figures 9 and 10 show a comparison between the results of the finite element model 
and the results obtained from the modally reduced elastic multibody model for the 
spur gear pair. For the simulations shown here about 1,500 modes per gear were 
used. This large number at first seems unnecessary, but as shown in [14], modes up 
to an eigenfrequency of about 80 kHz are necessary to get correct contact forces. 
This is mainly due to the local impulsive contact with very local deformations in a 
small contact area, see [10]. 

Both contact forces and rotational velocities agree very well to the finite element 
solution. For the first few contacts, the contact forces almost perfectly match the 
finite element solution in shape, magnitude and particularly the number of cogs 
simultaneously in contact. The rotational velocities, too, agree very well and show 
the torsional vibration of the outer rim against the hub during the free flight phase. 
The elastic multibody model is able to describe the elastic effects efficiently and 
very precise results can be obtained. 




10 12 

time in ms 

Fig. 9 Contact forces for the elastic multibody model and the finite element model 
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10 12 

time in ms 



Fig. 10 Rotational velocities for the elastic multibody model and the finite element model 



Table 1 Requirements for integration time, disk space and memory for 
the spur gear pair and 13 contacts using three different models 





Integration 


Disk 


Memory 


Model 


time 


requirement 


requirement 


MBS 


5s 


« 0.1GByte 


< 0. 1 GByte 


FEM 


6,98,400 s 


Ri 1 GByte 


Ks 1 GByte 


EMBS, 2d-contact 


683 s 


Ri 8 GByte 


Ks 0.5 GByte 


EMBS, 3d contact 


1,834s 


== 8 GByte 


ss 2 GByte 



4.2 Numerical Efficiency 



The results obtained from the elastic multibody model are almost as precise as 
the reference finite element solution. However, the motivation for the elastic multi- 
body model is to get a numerically much more efficient description of gear contact. 
To evaluate the numerical efficiency, the integration time and several hardware re- 
quirements are summarized in Table 1 . The computation times were measured on a 
Pentium 4 computer with 2 Gigabytes of RAM. 

Regarding the integration time and the time for setting up the model setup, the 
rigid body model is fastest. Though, as shown, the results are completely wrong 
for very elastic gears. The finite element model considers all elastic effects and, 
due to the fine discretization, can be regarded to give very precise results. On the 
other hand, the integration time is enormous. The elastic multibody model needs 
pre-processing. This pre-processing includes geometrical analyzes to extract con- 
tact nodes and surfaces from the finite element mesh and, most expensive, a modal 
analysis of each gear. However, the modal analysis only has to be carried out once 
per gear and is stored to disk, explaining the large disk requirements. In all subse- 
quent simulations, the pre-processed data can be used which shows in the integration 



Investigation of Gears Using an Elastic Multibody Model with Contact 321 

Table 2 Time distribution during one integration step (left) and one step of the 3d- 
contact calculation (right) for the elastic multibody model 



Task 


Percentage 

77% 


Task 


Percentage 


Contact calculation 


Transformation of contact nodes 


41% 


Integration 


19% 


Fine collision test 


27% 


Miscellaneous 


4% 


Coarse collision test 


25% 






Miscellaneous 


4% 






Transformation of nodal forces 


3% 



time, being about a factor of about 1 ,000 faster than the finite element analysis when 
the 2d-contact for spur gears is used and still about 400 times faster using the 3d- 
contact. 

To reveal the most costly parts in an integration step for the elastic multibody 
system. Table 2 (left) shows the percentage fraction of the major parts of one inte- 
gration step. For reasonably discretized flank surfaces, the most costly parts of one 
integration step are the integration itself and the contact calculation. The times nec- 
essary for one contact calculation step are broken down in Table 2 (right) revealing 
three big blocks consuming almost 90% of the total time altogether. 

The coarse collision detection accounts for about a quarter of the time needed for 
one contact calculation step. However, since the values represent average elapsed 
times evaluated from the considered example, these 25% partially include the time 
needed to load the transformation data from disk as well. The by far most costly 
part is the transformation of the contact nodes that strongly depends on the dis- 
cretization. Even though the model is fully implemented in Matlab, here, a further 
optimization is hardly possible, since the transformation matrices are already dense 
and Matlab uses a highly optimized basic linear algebra kernel. The fine collision 
detection accounts for about 27% of a contact step. Here, the Matlab implementation 
is optimized by implementing the fine collision detection in C. 



4.3 Helical Gears 

The second investigated gear pair is a helical gear pair with helix angle P = 15°, 
Zi = 18, andz2 = 37, see Fig. 11. 

Again, the two gears are mounted with ideal rotational joints and both gears 
are given an initial rotational velocity of CDio = cojo = 5rad/s. However, this time 
the rotational joints for the elastic multibody model are modeled using a penalty 
approach. This easily allows to calculate hub forces without explicitly having to 
solve constraint equations. 

Figure 12 (left) shows the contact forces for three contacts for the finite element 
and the elastic multibody model. Once more, the calculated contact forces agree 
very well. 
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Fig. 11 Helical gear pair with cylindrical gear bodies 
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Fig. 12 Contact forces for the elastic multibody model and the finite element model for helical 
gear pair 



The rotational velocities presented in Fig. 12 (right), too, show a very good 
agreement during all three contacts. However, the dynamical effects discussed above 
are not as apparent as for the spur gear pair for two reasons. First, the gear body of 
both gears are almost cylindrical and, therefore, very rigid considerably reducing 
elastic effects. 

Second, the helix angle results in an increased overlap ratio which in turn leads 
to two flanks in simultaneous contact for all three contacts. In contrast, as shown in 
Fig. 12 (left), a similar spur gear pair with helix angle fi = 0° only shows one flank 
in contact at a time. Therefore, the contact forces are much higher and the contact is 
shorter and harder for the spur gear. 

Another effect of the helix angle is that axial reaction forces are introduced, as 
shown in Fig. 13 (right) for the pinion during the first contact. Clearly, for elastic 
multibody models of the helical gear pair an axial reaction force can be observed 
that agrees very well to the finite element model again, whereas the spur gear pair 
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Fig. 13 Comparison of joint forces between the finite element model and the elastic multibody 
system and spur and helical gear 



only shows radial components. More surprisingly, however, the reaction forces of 
the spur gear pair and the helical gear pair show strong vibrations even though the 
contact forces as well as the rotational velocities show almost no vibrations. That 
shows that even for fairly rigid gears an elastic approach is necessary to compute 
precise results. 



5 Experimental Results 



To validate the proposed elastic multibody model, basic experimental impact inves- 
tigations have been carried out. For these investigations, a simple impact body with 
cuboid shape is impacting on a spur gear. The gear used for the experimental inves- 
tigations has z = 37 teeth of modulus m = 4.9mm and a cylindrical gear body. 
To reduce the experimental complexity, we limit ourself to stationary gear wheels, 
which tremendously reduces the measurement setup, particularly the alignment of 
impact body and gear wheel. 

To guarantee a very precise guidance of the impact cuboid, a guiding slide rail 
is used. To allow reproducible experimental conditions and negligible friction, the 
slide rail is supplied by pressured air that generates an air cushion upon which the 
cuboid hovers almost frictionless, see Fig. 14. 

Furthermore, some air exhausts from the gap between cuboid and slide rail, 
producing a self-aligning effect. The slide rail is mounted on a precision rotary stage 
that allows to adjust the alignment angle P with a precision of less than 1/1000°. 
The rotary stage itself is mounted on a frame that allows to adjust the alignment an- 
gles a and y, see Fig. 15 (left). The complete experimental setup is shown in Fig. 15 
(right). 

Generally, the direct measurement of contact forces is difficult. Therefore, flank 
velocities, as well as strains are measured in close vicinity to the contact area and 
compared to corresponding simulation results. Laser-Doppler vibrometers are used 
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Fig. 14 Schematic drawing of the air supplied slide rail 




Fig. 15 Rotary stage and supporting frame for cuboid impact body (left) and overall experimental 
setup (right) 
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Fig. 16 Flank velocities in the middle of the rear flank (left) and cuboid velocity (right) for 
experiment, finite element model and elastic multibody model 



to measure the flank velocity on the back side of the impacted tooth and strain 
gauges are applied in the dedendum. Since the interesting frequencies are up to 
80 kHz, DC amplifiers instead of carrier frequency amplifiers are used, to directly 
measure the mistune of the bridge. 

Figure 1 6 shows the flank velocity measured in the middle of the rear flank and 
the cuboid velocity for one experiment together with the simulation results from the 
finite element model and the elastic multibody model. Apparently, the results agree 
very well. 
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Fig. 17 Position of the strain gauges in the dedendum and strain signals for one impact for 
experiment and finite element model 



The measured and simulated strains in the dedendum are shown in Fig. 17. The 
signal of the strain gauges turns out to be very noisy which is mainly due to the use 
of DC amplifiers. To reduce the noise ten measurements have been averaged. Here 
too, a good agreement between simulation and experiment can be observed. 

Comparing the experimental results with simulations, apparently, the finite ele- 
ment model and the elastic multibody model give very good results. Both the surface 
velocities on different teeth, as well as strains in the dedendum can be calculated 
precisely and the good agreement between simulation and experiment even holds 
for a long time after the contact. For the basic impact conditions described above, 
the elastic multibody model can be regarded as validated. 



6 Conclusions 



A very detailed finite element model and a classical rigid body model have been 
used to investigate several contacts between a gear pair taken from a real technical 
application. It was shown that the elasticity of the gear body has a significant influ- 
ence on the contact forces as well as on the global motion and that simple rigid body 
models cannot be used to calculate precise contact forces. Instead, a fully elastic 
approach is necessary. 
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To avoid the tremendous numerical effort needed to solve the full finite element 
model, a modally reduced elastic multibody model with contact has been presented. 
However, several implementation issues must be considered carefully to get a nu- 
merically efficient implementation and to reduce the memory requirements needed 
for contact node transformations. Particularly a coarse collision detection including 
a dynamic reloading scheme is important. Besides a general tree-dimensional node- 
to-segment approach, a very efficient contact algorithm for perfectly aligned spur 
gears is used. In combination with an explicit integration scheme which exploits the 
structure of the mass matrix for its inversion, this elastic multibody model allows 
very precise simulations in short time compared to finite element solutions. The 
presented approach is general enough to also simulate helical gears and simulation 
results for a helical gear pair were shown and compared to results of a similar spur 
gear pair. 

In order to validate the numerical models, basic experimental impact investi- 
gations have been carried out, where a stationary gear wheel was impacted with 
simple impact bodies and flank velocities as well as strains in the dedendum have 
been measured. Comparing simulation and experiment, a very good agreement was 
found. 
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